Data source: CROSBI

New version of the Croatian National Corpus (CROSBI ID 40615)

Book chapter | original scientific paper

Tadić, Marko. New version of the Croatian National Corpus // After Half a Century of Slavonic Natural Language Processing / Hlaváčková, Dana ; Horák, Aleš ; Osolsobě, Klara et al. (eds.). Brno: Masarykova univerzita, 2009, pp. 199-205

Authorship details

Tadić, Marko

English

New version of the Croatian National Corpus

This contribution presents the new version (v 2.5) of the Croatian National Corpus (HNK). It begins with a brief history of the collection of HNK and of its first two versions. It then describes the differences and novelties introduced in the new version: 1) new text samples that bring the existing corpus structure closer to the desired ideal ensemble of text types, genres and topics; 2) lemmatization and full MSD-tagging of the whole corpus. The second update was carried out using the lemmatizer and MSD-tagger for Croatian described in (Agić et al. 2008, Agić et al. 2009a). In tagging, it achieves results at the level of state-of-the-art taggers for other Slavic languages, while in lemmatization it offers some novel solutions in its hybrid approach to lemma disambiguation. The lemmatized, MSD-tagged and disambiguated HNK is available for querying through the standard Manatee/Bonito client-server architecture. The contribution concludes with future directions for HNK.

corpus, corpus linguistics, Croatian National Corpus, Croatian language

Chapter details

199-205.

published

Book details

After Half a Century of Slavonic Natural Language Processing

Hlaváčková, Dana ; Horák, Aleš ; Osolsobě, Klara ; Rychlý, Pavel

Brno: Masarykova univerzita

2009.

978-80-7399-815-8

Related research field

Philology