Data source: CROSBI

Dimensionality reduction in representation of textual documents (CROSBI ID 546301)

Conference contribution in proceedings | abstract of conference presentation | international peer review

Dobša, Jasminka. Dimensionality reduction in representation of textual documents // 4th Croatian Mathematical Congress, CroMC2008 / Rudolf Scitovski (ed.). Osijek, 2008. p. 24

Responsibility details

Dobša, Jasminka

English

Dimensionality reduction in representation of textual documents

The task of information retrieval is to extract the documents relevant to a given query from a collection of textual documents. In the vector space model, documents are represented as vectors in a high-dimensional space. Such a representation suffers from problems caused by the fact that relations between index terms are neglected: documents relevant to a user query are recognized only if there is term matching between the query and the document. For this reason, reparametrization methods have been developed that represent documents in a lower-dimensional space in which documents on similar topics are clustered together even if the term profiles used in them are slightly different. Two such methods are presented here: latent semantic indexing and concept indexing. In latent semantic indexing, the original vector space representations of documents are projected onto the first k left singular vectors, while in concept indexing they are projected onto the centroids of clusters. Adding new documents to the collection is a particular problem. The vectors onto which documents are projected are constructed from the representations of all documents in the collection, so computing the reduced-dimension representations of newly added documents requires recomputing the SVD (for latent semantic indexing) or the concept decomposition (for concept indexing). The solution to this problem is the development of methods that give approximate representations of newly added documents in the reduced-dimension space. Possible solutions for approximate representations will be presented.
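The two projections named in the abstract can be illustrated with a minimal NumPy sketch. Everything concrete in it is an assumption made for illustration and is not taken from the presentation: the toy term-document matrix `A`, the fixed cluster assignment `labels`, the helper names, and the reading of "projected onto the centroids" as a least-squares fit onto the span of the normalized cluster centroids. The new document is handled by reusing the existing basis without recomputation (often called folding-in), which is only one possible approximation and not necessarily the one proposed by the author.

```python
import numpy as np

# Hypothetical toy term-document matrix: rows are index terms, columns are documents.
# Documents 0-1 share a linear algebra vocabulary, documents 2-3 a search vocabulary.
A = np.array([
    [1.0, 1.0, 0.0, 0.0],  # "matrix"
    [1.0, 0.0, 0.0, 0.0],  # "vector"
    [0.0, 1.0, 0.0, 0.0],  # "svd"
    [0.0, 0.0, 1.0, 1.0],  # "query"
    [0.0, 0.0, 1.0, 0.0],  # "retrieval"
    [0.0, 0.0, 0.0, 1.0],  # "index"
])
k = 2  # dimension of the reduced space


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def lsi_basis(A, k):
    """Return the first k left singular vectors of A as columns."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]


def concept_basis(A, labels, k):
    """Return normalized centroids of the unit-length documents in each cluster."""
    unit_docs = A / np.linalg.norm(A, axis=0, keepdims=True)
    centroids = [unit_docs[:, labels == j].mean(axis=1) for j in range(k)]
    C = np.column_stack(centroids)
    return C / np.linalg.norm(C, axis=0, keepdims=True)


def concept_coords(C, docs):
    """Least-squares coordinates of docs in the span of the centroid columns of C."""
    coords, *_ = np.linalg.lstsq(C, docs, rcond=None)
    return coords


U_k = lsi_basis(A, k)
labels = np.array([0, 0, 1, 1])  # assumed clustering; a real system would compute it
C = concept_basis(A, labels, k)

docs_lsi = U_k.T @ A            # k x n latent semantic indexing representation
docs_ci = concept_coords(C, A)  # k x n concept indexing representation

# A newly added document ("vector", "svd"), represented with the OLD bases,
# i.e. without recomputing the SVD or the concept decomposition.
d_new = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 0.0])
new_lsi = U_k.T @ d_new
new_ci = concept_coords(C, d_new)

print(cosine(d_new, A[:, 0]))           # about 0.5 in the original term space
print(cosine(new_lsi, docs_lsi[:, 0]))  # close to 1 in the LSI space
print(cosine(new_ci, docs_ci[:, 0]))    # close to 1 in the concept space
```

In this toy collection the new document shares only one term with document 0, so their similarity in the original space is moderate, while in both reduced spaces the two documents fall in the same direction. The approximation degrades as more documents with new vocabulary are added, because the old basis no longer reflects the whole collection.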

information retrieval; latent semantic indexing; concept indexing


Contribution details

24-24.

2008.

published

Host publication details

4th Croatian Mathematical Congress, CroMC2008

Rudolf Scitovski

Osijek:

Conference details

4th Croatian Mathematical Congress, CroMC2008

lecture

17.06.2008-20.06.2008

Osijek, Croatia

Related fields

Computing, Information and communication sciences