Pseudo-lemmatization in Croatian-English SMT (CROSBI ID 615078)
Conference contribution in a journal | original scientific paper | international peer review
Responsibility details
Brkić, Marija ; Matetić, Maja ; Seljan, Sanja
English
Pseudo-lemmatization in Croatian-English SMT
One of the first difficulties in conducting a thorough analysis of statistical machine translation involving Croatian, a morphologically rich and resource-poor language, is the lack of quality language resources. This paper presents the results of two standard fourteen-feature Croatian-English phrase-based statistical machine translation systems. Before the second system is built, the Croatian parts of the training and test sets are partially pseudo-lemmatized in an attempt to simplify the translation process. In addition to automatic evaluation, a manual evaluation is conducted to gain insight into the nature of the translation differences between the two systems.
phrase-based statistical machine translation; pseudolemmatization; Croatian-English
Contribution details
242-249.
2014.
published
Host publication details
Central European conference on information and intelligent systems
Hunjak, T. ; Lovrenčić, S. ; Tomičić, I.
Varaždin: Fakultet organizacije i informatike Sveučilišta u Zagrebu
1847-2001
Conference details
Central European Conference on Information and Intelligent Systems
lecture
17.09.2014-19.09.2014
Varaždin, Croatia