Data source: CROSBI

Approximate Representation of Textual Documents in the Concept Space (CROSBI ID 131934)

Journal article | original scientific paper | international peer review

Dobša, Jasminka ; Dalbelo-Bašić, Bojana Approximate Representation of Textual Documents in the Concept Space // Informatica (Ljubljana), 31 (2007), 1; 21-27-x

Responsibility details

Dobša, Jasminka ; Dalbelo-Bašić, Bojana

English

Approximate Representation of Textual Documents in the Concept Space

In this paper we address the problem of adding new documents to a collection whose documents are represented in a lower-dimensional space by concept indexing. Concept indexing (CI) is a feature construction method that relies on the concept decomposition of the term-document matrix. With CI, the original document representations are projected onto the space spanned by the centroids of clusters, which are called concept vectors. The vectors onto which documents are projected during dimension reduction are constructed from the representations of all documents in the collection, so computing the exact representations of new documents in the reduced space requires recomputing the concept decomposition. This problem is especially relevant for applications on the World Wide Web. The solution is to develop methods that give an approximate representation of newly added documents in the reduced space. Two such methods are introduced in the paper. In the first method no new index terms are added, and the added documents are represented by the existing list of index terms. In the second method the list of index terms is extended, and the representations of documents and concept vectors are extended in the dimensions of the newly added terms. The proposed methods are tested on an information retrieval task. It is shown that representing documents by the extended list of index terms does not significantly improve information retrieval performance.
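
The first of the two methods (new documents represented by the existing index terms) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes spherical k-means for the clustering step and a least-squares projection onto the concept vectors, neither of which the abstract spells out. The function names `normalize_columns`, `concept_vectors`, and `project`, and the random toy data, are hypothetical.

```python
import numpy as np

def normalize_columns(M):
    """Scale each column to unit Euclidean length (zero columns stay zero)."""
    norms = np.linalg.norm(M, axis=0, keepdims=True)
    norms[norms == 0] = 1.0
    return M / norms

def concept_vectors(A, k, n_iter=20, seed=0):
    """Assumed clustering step: spherical k-means on the columns of A.

    Returns a terms-by-k matrix whose columns are unit-length cluster centroids.
    """
    rng = np.random.default_rng(seed)
    A = normalize_columns(A)
    # Initialise the centroids with k distinct documents.
    C = A[:, rng.choice(A.shape[1], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmax(C.T @ A, axis=0)  # assign each document by cosine similarity
        for j in range(k):
            members = A[:, labels == j]
            if members.shape[1] > 0:         # an empty cluster keeps its old centroid
                C[:, j] = members.sum(axis=1)
        C = normalize_columns(C)
    return C

def project(C, docs):
    """Least-squares coordinates of docs (terms x n) in the span of the columns of C."""
    Z, *_ = np.linalg.lstsq(C, docs, rcond=None)
    return Z

# Toy data (random, for illustration only): 50 index terms, 30 documents.
rng = np.random.default_rng(1)
A = rng.random((50, 30))
C = concept_vectors(A, k=4)
Z = project(C, A)                # reduced representations of the existing documents

# A newly added document over the same 50 index terms is projected onto the
# existing concept vectors; the concept decomposition is not recomputed.
new_doc = rng.random((50, 1))
z_new = project(C, new_doc)
```

Because `C` is reused unchanged, `z_new` is only an approximation of the representation that a full recomputation over the enlarged collection would give. The second method, which extends the index term list, is not sketched here because the abstract does not state how the concept vectors are extended.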

dimensionality reduction; concept decomposition; information retrieval

Publication details

31 (1)

2007.

21-27-x

published

0350-5596

Work associations

Computing

Indexing