Data source: CROSBI

Implementation of Croatian NERC system (CROSBI ID 528580)

Conference contribution in proceedings | original scientific paper | international peer review

Bekavac, Božo ; Tadić, Marko. Implementation of Croatian NERC system // Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007, Special Theme: Information Extraction and Enabling Technologies / Piskorski, Jakub ; Tanev, Hristo ; Pouliquen, Bruno et al. (eds.). Prague: Association for Computational Linguistics (ACL), 2007. pp. 11-18-x

Responsibility details

Bekavac, Božo ; Tadić, Marko

English

Implementation of Croatian NERC system

In this paper a system for Named Entity Recognition and Classification in Croatian language is described. The system is composed of the module for sentence segmentation, inflectional lexicon of common words, inflectional lexicon of names and regular local grammars for automatic recognition of numerical and temporal expressions. After the first step (sentence segmentation), the system attaches to each token its full morphosyntactic description and appropriate lemma and additional tags for potential categories for names without disambiguation. The third step (the core of the system) is the application of a set of rules for recognition and classification of named entities in already annotated texts. Rules based on described strategies (like internal and external evidence) are applied in cascade of transducers in defined order. Although there are other classification systems for NEs, the results of our system are annotated NEs which are following MUC-7 specification. System is applied on informative and noninformative texts and results are compared. F-measure of the system applied on informative texts yields over 90%.
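The three steps named in the abstract (sentence segmentation, lexicon-based tagging, an ordered cascade of rules) can be illustrated with a toy sketch. This is not the authors' implementation: the lexicon entries, the regular expressions, the trigger word "profesor" and the function names `split_sentences` and `recognize` are all assumptions made for illustration, and plain regular expressions stand in for the transducers and inflectional lexicons the paper describes. The labels follow MUC-7 categories (PERSON and LOCATION are ENAMEX types, DATE is a TIMEX type, PERCENT is a NUMEX type).

```python
import re

# Hypothetical toy lexicon of names (internal evidence); a stand-in for
# the inflectional lexicon of names mentioned in the abstract.
NAME_LEXICON = {"Zagreb": "LOCATION", "Marko": "PERSON"}

# Ordered cascade of illustrative rules: earlier rules take precedence.
RULES = [
    # Numeric date such as 23.06.2007
    (re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b"), "DATE"),
    # Percentage such as 90% or 12,5 %
    (re.compile(r"\b\d+(?:,\d+)?\s?%"), "PERCENT"),
    # External evidence: a capitalised token right after the title "profesor"
    (re.compile(r"(?<=\bprofesor )[A-ZČĆĐŠŽ]\w+"), "PERSON"),
]


def split_sentences(text):
    """Naive segmentation: split after ., ! or ? when a capital letter follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+(?=[A-ZČĆĐŠŽ])", text) if s]


def recognize(sentence):
    """Apply the rule cascade, then the lexicon, skipping overlapping spans."""
    found = []  # entries are (start, end, text, label)

    def overlaps(start, end):
        return any(start < f_end and f_start < end
                   for f_start, f_end, _, _ in found)

    for pattern, label in RULES:
        for m in pattern.finditer(sentence):
            if not overlaps(m.start(), m.end()):
                found.append((m.start(), m.end(), m.group(), label))

    # Lexicon lookup on the remaining tokens (no disambiguation attempted).
    for m in re.finditer(r"\w+", sentence):
        label = NAME_LEXICON.get(m.group())
        if label and not overlaps(m.start(), m.end()):
            found.append((m.start(), m.end(), m.group(), label))

    # Return entities in order of appearance.
    return [(text, label) for _, _, text, label in sorted(found)]


if __name__ == "__main__":
    sample = "U Zagreb je 23.06.2007 stigao profesor Tadić. Rezultat je 90%."
    for sentence in split_sentences(sample):
        print(sentence, recognize(sentence))
```

A real system would need abbreviation-aware segmentation and full morphosyntactic tagging; the sketch only shows how rule order and the two kinds of evidence interact.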

named entity recognition and classification; Croatian; computational linguistics; information extraction

not recorded

not recorded

not recorded

not recorded

not recorded

not recorded

Contribution details

11-18-x.

2007.

not recorded

published

978-1-932432-88-6

Host publication details

Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007, Special Theme: Information Extraction and Enabling Technologies

Piskorski, Jakub ; Tanev, Hristo ; Pouliquen, Bruno ; Steinberger, Ralf

Prague: Association for Computational Linguistics (ACL)

Conference details

45th Annual Meeting of the Association for Computational Linguistics (ACL 2007)

lecture

23.06.2007-30.06.2007

Prague, Czech Republic

Research areas

Information and communication sciences, Philology