Hrvatska znanstvena bibliografija (Croatian Scientific Bibliography)

Bibliographic record number: 326450

Journal

Authors: Šnajder, Jan; Dalbelo Bašić, Bojana; Tadić, Marko
Title: Automatic Acquisition of Inflectional Lexica for Morphological Normalisation
Source: Information Processing & Management (0306-4573) 44 (2008), 5; 1720-1731
Type of work: article
Keywords: Morphological normalisation; morphological lexicon; lexicon acquisition; inflection; Croatian language; text mining; information retrieval
Abstract:
Due to natural language morphology, words can take on various morphological forms. Morphological normalisation – often used in information retrieval and text mining systems – conflates morphological variants of a word to a single representative form. In this paper, we describe an approach to lexicon-based inflectional normalisation. This approach is in between stemming and lemmatisation, and is suitable for morphological normalisation of inflectionally complex languages. To eliminate the immense effort required to compile the lexicon by hand, we focus on the problem of acquiring automatically an inflectional morphological lexicon from raw corpora. We propose a convenient and highly expressive morphology representation formalism on which the acquisition procedure is based. Our approach is applied to the morphologically complex Croatian language, but it should be equally applicable to other languages of similar morphological complexity. Experimental results show that our approach can be used to acquire a lexicon whose linguistic quality allows for rather good normalisation performance.
Project / topic: 130-1300646-0645, 036-1300646-1986
Original language: ENG
Current Contents: YES
Citation Index: YES
Other indexing publications: CompuMath Citation Index; Information Science Abstracts; LISA: Library and Information Science Abstracts; PIRA (Packaging, Paper, Printing and Publishing, Imaging and Nonwovens Abstracts); PsycINFO; Zentralblatt für Mathematik/Mathematical Abstracts
Category: Scientific
Scientific fields:
Computing, Information and communication sciences, Philology
Print medium: yes
URL: http://dx.doi.org/10.1016/j.ipm.2008.03.006
DOI: 10.1016/j.ipm.2008.03.006
Entered into CROSBI by: Bojana Dalbelo Bašić (bojana.dalbelo@fer.hr), 26 Mar 2008 at 23:26


