Data source: CROSBI

Implementation of Croatian NERC system (CROSBI ID 49164)

Book chapter | original scientific paper

Bekavac, Božo ; Tadić, Marko Implementation of Croatian NERC system // Technologies for the Processing and Retrieval of Semi-Structured Documents: Experience from the CADIAL Project / Tadić, Marko ; Dalbelo Bašić, Bojana ; Moens, Marie-Francine (eds.). Zagreb: Hrvatsko društvo za jezične tehnologije, 2009. pp. 99-113

Responsibility details

Bekavac, Božo ; Tadić, Marko

English

Implementation of Croatian NERC system

In this paper a system for Named Entity Recognition and Classification in Croatian language is described. The system is composed of the module for sentence segmentation, inflectional lexicon of common words, inflectional lexicon of names and regular local grammars for automatic recognition of numerical and temporal expressions. After the first step (sentence segmentation), the system attaches to each token its full morphosyntactic description and appropriate lemma and additional tags for potential categories for names without disambiguation. The third step (the core of the system) is the application of a set of rules for recognition and classification of named entities in already annotated texts. Rules based on described strategies (like internal and external evidence) are applied in cascade of transducers in defined order. Although there are other classification systems for NEs, the results of our system are annotated NEs which are following MUC-7 specification. System is applied on informative and noninformative texts and results are compared. F-measure of the system applied on informative texts yields over 90%.
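
The idea of applying rules in a fixed cascade, using internal evidence (clues inside the name itself) and external evidence (clues in the surrounding context), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, rule functions and the `tag` helper are hypothetical, the rules are plain Python functions rather than transducers, and no morphosyntactic annotation is modelled. Only the category names PERSON and ORGANIZATION follow the MUC-7 style mentioned in the abstract.

```python
# Hypothetical toy resources, invented for illustration only.
FIRST_NAMES = {"Marko", "Ana"}           # stand-in for a lexicon of names
ORG_SUFFIXES = {"d.d.", "d.o.o."}        # internal evidence for organizations
PERSON_TRIGGERS = {"ministar", "profesor"}  # external evidence for persons


def is_capitalized(token):
    return token[:1].isupper()


def rule_internal_org(tokens, labels):
    # Internal evidence: a capitalized word followed by a company suffix.
    for i in range(len(tokens) - 1):
        if (labels[i] is None and is_capitalized(tokens[i])
                and tokens[i + 1] in ORG_SUFFIXES):
            labels[i] = labels[i + 1] = "ORGANIZATION"


def rule_internal_person(tokens, labels):
    # Internal evidence: a known first name, optionally followed by a
    # capitalized, still unlabeled token (treated as a surname).
    for i, token in enumerate(tokens):
        if labels[i] is None and token in FIRST_NAMES:
            labels[i] = "PERSON"
            if (i + 1 < len(tokens) and labels[i + 1] is None
                    and is_capitalized(tokens[i + 1])):
                labels[i + 1] = "PERSON"


def rule_external_person(tokens, labels):
    # External evidence: a capitalized token preceded by a title word.
    for i in range(1, len(tokens)):
        if (labels[i] is None and tokens[i - 1].lower() in PERSON_TRIGGERS
                and is_capitalized(tokens[i])):
            labels[i] = "PERSON"


# The order matters: later rules only label tokens left open by earlier ones.
CASCADE = [rule_internal_org, rule_internal_person, rule_external_person]


def tag(tokens):
    labels = [None] * len(tokens)
    for rule in CASCADE:
        rule(tokens, labels)
    return list(zip(tokens, labels))
```

For example, `tag(["Ministar", "Horvat", "i", "Marko", "Tadić", "posjetili", "su", "Pliva", "d.d."])` labels "Horvat" through external evidence, "Marko Tadić" and "Pliva d.d." through internal evidence, and leaves the remaining tokens unlabeled.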

named entity recognition and classification, Croatian, computational linguistics, information extraction

not recorded

not recorded

not recorded

not recorded

not recorded

not recorded

Chapter details

99-113.

published

Book details

Technologies for the Processing and Retrieval of Semi-Structured Documents: Experience from the CADIAL Project

Tadić, Marko ; Dalbelo Bašić, Bojana ; Moens, Marie-Francine

Zagreb: Hrvatsko društvo za jezične tehnologije

2009.

978-953-55375-1-9

Related research fields

Information and communication sciences, Philology