Data source: CROSBI

Evaluating Full Lemmatization of Croatian Texts (CROSBI ID 37971)

Book chapter | original scientific paper

Agić, Željko ; Tadić, Marko ; Dovedan, Zdravko Evaluating Full Lemmatization of Croatian Texts // Recent Advances in Intelligent Information Systems / Klopotek, Mieczyslaw ; Przepiorkowski, Adam ; Wierzchon, Slawomir et al. (eds.). Warsaw: Academic Publishing House EXIT, 2009. pp. 175-184

Responsibility details

Agić, Željko ; Tadić, Marko ; Dovedan, Zdravko

English

Evaluating Full Lemmatization of Croatian Texts

The paper presents the implementation and evaluation of a module for full lemmatization of Croatian texts. The module implements several lemmatization procedures, all of them based on merging outputs of the previously developed stochastic morphosyntactic tagger CroTag and the inflectional lexicon of Croatian. Evaluation of the lemmatization module on two test cases, simulating realistic and ideal operating conditions, provided full lemmatization accuracy scores of 96.96 and 98.15 percent, respectively. It is also shown that a majority of errors in this framework, 57.14 percent in the realistic testing scenario, occur on word forms with external homography. Moreover, approximately 80 percent of all lemmatization errors occur on nouns, adjectives and adverbs, in that particular order. Language resources, testing environment and procedure descriptions are provided in the paper along with a discussion of obtained results and possible future research directions.

full lemmatization, morphosyntactic tagging, Croatian language

Chapter details

175-184.

published

Book details

Recent Advances in Intelligent Information Systems

Klopotek, Mieczyslaw ; Przepiorkowski, Adam ; Wierzchon, Slawomir ; Trojanowski, Krzysztof

Warsaw: Academic Publishing House EXIT

2009.

978-83-60434-59-8

Related research fields

Computer science, Information and communication sciences, Philology