Guessing the Correct Inflectional Paradigm of Unknown Croatian Words (CROSBI ID 592595)
Conference paper in proceedings | original scientific paper | international peer review
Responsibility information
Šnajder, Jan
English
Guessing the Correct Inflectional Paradigm of Unknown Croatian Words
A real-life morphological analyzer must be able to properly handle out-of-vocabulary words. We address the task of guessing the correct inflectional paradigm of unknown Croatian words. We frame this as a supervised machine learning problem: we train a model to decide whether a candidate lemma-paradigm pair is correct, based on a number of string- and corpus-based features. Our aim is to examine the machine learning aspect of the problem: we analyze the features and evaluate the classification accuracy using different feature subsets. We show that a satisfactory level of accuracy (92%) can be achieved with an SVM using a combination of string- and corpus-based features. We discuss a number of possible directions for future research.
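As a rough illustration of the setup the abstract describes, the sketch below generates candidate lemma-paradigm pairs for an unknown word form and computes one string-based and two corpus-based features per pair. It is not the paper's implementation: the two toy paradigms, the tiny corpus, the feature names, and the helper functions `candidates` and `features` are all assumptions made for this example, and the classifier itself (an SVM in the paper) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """A toy inflectional paradigm (hypothetical, heavily simplified)."""
    name: str
    lemma_suffix: str        # suffix that turns a stem into the lemma
    form_suffixes: tuple     # suffixes of the inflected forms


# Assumption: two simplified noun paradigms, for illustration only.
PARADIGMS = (
    Paradigm("noun-fem-a", "a", ("a", "e", "i", "u", "o", "om")),
    Paradigm("noun-masc-0", "", ("", "a", "u", "om", "e")),
)

# Assumption: a tiny stand-in for word forms attested in a corpus.
CORPUS = frozenset({"riba", "ribe", "ribi", "ribu", "ribom"})


def candidates(word):
    """Return every (lemma, paradigm) pair that could have produced `word`."""
    found = []
    seen = set()
    for paradigm in PARADIGMS:
        for suffix in paradigm.form_suffixes:
            # The word must end in the suffix and leave a non-empty stem.
            if word.endswith(suffix) and len(word) > len(suffix):
                stem = word[: len(word) - len(suffix)]
                lemma = stem + paradigm.lemma_suffix
                if (lemma, paradigm.name) not in seen:
                    seen.add((lemma, paradigm.name))
                    found.append((lemma, paradigm))
    return found


def features(lemma, paradigm, corpus):
    """Compute illustrative features for one candidate pair."""
    stem = lemma[: len(lemma) - len(paradigm.lemma_suffix)]
    forms = {stem + suffix for suffix in paradigm.form_suffixes}
    attested = sum(1 for form in forms if form in corpus)
    return {
        # String-based: length of the lemma-forming suffix.
        "lemma_suffix_len": float(len(paradigm.lemma_suffix)),
        # Corpus-based: share of generated forms seen in the corpus.
        "attested_ratio": attested / len(forms),
        # Corpus-based: is the proposed lemma itself attested?
        "lemma_attested": 1.0 if lemma in corpus else 0.0,
    }


if __name__ == "__main__":
    for lemma, paradigm in candidates("ribu"):
        print(lemma, paradigm.name, features(lemma, paradigm, CORPUS))
```

In a supervised setting, feature vectors like these would be computed for pairs with known correct/incorrect labels and passed to a binary classifier. A single feature can be misleading on its own (several candidates may have many attested forms), which is one reason to combine string- and corpus-based evidence.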
Morphological analysis; machine learning
Contribution details
185-190.
2012.
published
Host publication details
Proceedings of the Eighth Language Technologies Conference
Erjavec, Tomaž ; Žganec Gros, Jerneja
Ljubljana:
Conference details
Information Society 2012 - Eighth Language Technologies Conference
lecture
08.10.2012-09.10.2012
Ljubljana, Slovenia