Data source: CROSBI

Query-Driven Indexing for Scalable Peer-to-Peer Text Retrieval (CROSBI ID 528840)

Conference paper in proceedings | original scientific paper | international peer review

Skobeltsyn, Gleb ; Luu, Toan ; Podnar Žarko, Ivana ; Rajman, Martin ; Aberer, Karl Query-Driven Indexing for Scalable Peer-to-Peer Text Retrieval // Infoscale: the Second International Conference on Scalable Information Systems. New York (NY): The Association for Computing Machinery (ACM), 2007

Responsibility details

Skobeltsyn, Gleb ; Luu, Toan ; Podnar Žarko, Ivana ; Rajman, Martin ; Aberer, Karl

English

Query-Driven Indexing for Scalable Peer-to-Peer Text Retrieval

We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with bandwidth consumption that has been identified as the major problem for the standard P2P approach with single term indexing, we leverage a distributed index that stores up to top-k document references only for carefully chosen indexing term combinations. In addition, since the number of possible term combinations extracted from a document collection can be very large, we propose to use query statistics to index only such combinations that are indeed frequently requested by the users. Thus, by avoiding the maintenance of superfluous indexing information, we achieve a substantial reduction in bandwidth and storage. A specific activation mechanism is applied to continuously update the indexing information according to changes in the query distribution, resulting in an efficient, constantly evolving query-driven indexing structure. We show that the size of the index and the generated indexing/retrieval traffic remains manageable even for web-size document collections at the price of a marginal loss in precision for rare queries. Our theoretical analysis and experimental results provide convincing evidence about the feasibility of the query-driven indexing strategy for large scale P2P text retrieval. Moreover, our experiments confirm that the retrieval performance is only slightly lower than the one obtained with state-of-the-art centralized query engines.
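The mechanism summarized in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the class name `QueryDrivenIndex`, the parameters `top_k` and `activation_threshold`, the term-frequency score, and the fallback that intersects truncated single-term lists are all assumptions made for illustration, and a plain dictionary stands in for the DHT.

```python
from collections import Counter
import heapq


class QueryDrivenIndex:
    """Toy, in-memory stand-in for a DHT-based index (hypothetical design)."""

    def __init__(self, documents, top_k=2, activation_threshold=2):
        self.documents = documents                  # doc_id -> list of terms
        self.top_k = top_k                          # max references stored per key
        self.activation_threshold = activation_threshold
        self.query_counts = Counter()               # query statistics per key
        self.index = {}                             # frozenset of terms -> doc ids
        # Assumption: single-term keys are always indexed, truncated to top_k.
        vocabulary = {t for terms in documents.values() for t in terms}
        for term in vocabulary:
            key = frozenset([term])
            self.index[key] = self._top_k(key)

    def _top_k(self, key):
        """Rank documents containing every term of the key by summed term frequency."""
        scored = []
        for doc_id, terms in self.documents.items():
            counts = Counter(terms)
            if all(counts[t] > 0 for t in key):
                scored.append((sum(counts[t] for t in key), doc_id))
        return [doc_id for _, doc_id in heapq.nlargest(self.top_k, scored)]

    def query(self, terms):
        key = frozenset(terms)
        self.query_counts[key] += 1
        if key in self.index:
            return self.index[key]
        # Activate a term combination only once it is requested often enough.
        if self.query_counts[key] >= self.activation_threshold:
            self.index[key] = self._top_k(key)
            return self.index[key]
        # Rare query: approximate the answer from truncated single-term lists.
        lists = [self.index.get(frozenset([t]), []) for t in sorted(key)]
        return [d for d in lists[0] if all(d in other for other in lists[1:])]


docs = {
    "d1": ["p2p"] * 3 + ["index"],
    "d2": ["p2p"] * 4 + ["retrieval"],
    "d3": ["index"] * 3 + ["p2p"] * 2,
    "d4": ["index"] * 2 + ["text"],
}
idx = QueryDrivenIndex(docs, top_k=2, activation_threshold=2)
print(idx.query(["p2p", "index"]))  # first request: key not yet activated
print(idx.query(["p2p", "index"]))  # second request: key is activated and indexed
```

In this toy setup the first request for the combination returns an empty list, because the truncated single-term lists share no document, while the second request activates the key and returns the documents that contain both terms. This mirrors, in a very simplified way, the trade-off the abstract describes between index size and precision for rare queries. Deactivation of keys when the query distribution shifts is omitted here.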

P2P; DHT; IR; Query-Driven Indexing; Scalability

Contribution details

2007.

published

Host publication details

Infoscale: the Second International Conference on Scalable Information Systems

New York (NY): The Association for Computing Machinery (ACM)

Conference details

Infoscale: the Second International Conference on Scalable Information Systems

lecture

06.06.2007-08.06.2007

Suzhou, China

Related fields

Electrical engineering, Computing