Building the Croatian National Corpus (CROSBI ID 28285)
Book chapter | original scientific paper
Responsibility data
Tadić, Marko
English
Building the Croatian National Corpus
The paper presents the work done so far on building the Croatian National Corpus (HNK). The corpus has been compiled since 1998 at the Institute of Linguistics, Faculty of Philosophy, University of Zagreb. Its size, time span, composition, and criteria for text selection are presented. The HNK consists of two parts: 1) a 30-million corpus of contemporary Croatian, and 2) the Croatian Electronic Textual Archive. The procedures for corpus mark-up and processing are discussed. One of the most interesting features of the corpus, available since its launch in 1998, is that it can be queried through the WWW. The paper closes with future directions: enlarging the 30m corpus to 100m in the next few years, enhanced corpus management and querying, and further annotation and processing.
Croatian language, corpus building, Croatian National Corpus, POS tagging
not recorded
not recorded
not recorded
not recorded
not recorded
not recorded
Contribution data
441-446.
published
Book data
Third International Conference on Language Resources and Evaluation LREC2002
González Rodriguez, M.; Suarez Araujo, C. P.
Paris; Las Palmas de Gran Canaria: European Language Resources Association (ELRA)
2002.
2-9517408-0-8