Data source: CROSBI

Distributional Semantics Approach to Detecting Synonyms in Croatian Language (CROSBI ID 590915)

Conference paper in proceedings | original scientific paper | international peer review

Karan, Mladen ; Šnajder, Jan ; Dalbelo Bašić, Bojana. Distributional Semantics Approach to Detecting Synonyms in Croatian Language // Proceedings of the Eighth Language Technologies Conference / Erjavec, Tomaž ; Žganec Gros, Jerneja (eds.). Ljubljana, 2012. pp. 111-116

Responsibility details

Karan, Mladen ; Šnajder, Jan ; Dalbelo Bašić, Bojana

English

Distributional Semantics Approach to Detecting Synonyms in Croatian Language

Identifying synonyms is important for many natural language processing and information retrieval applications. In this paper we address the task of automatically identifying synonyms in Croatian language using distributional semantic models (DSM). We build several DSMs using latent semantic analysis (LSA) and random indexing (RI) on the large hrWaC corpus. We evaluate the models on a dictionary-based similarity test – a set of synonymy questions generated automatically from a machine-readable dictionary. Results indicate that LSA models outperform RI models on this task, with accuracy of 68.7%, 68.2%, and 61.6% on nouns, adjectives, and verbs, respectively. We analyze how word frequency and polysemy level affect the performance and discuss common causes of synonym misidentification.
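The abstract names random indexing and a synonymy-question test without showing how they fit together. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a toy tokenized corpus in place of hrWaC, and a question format of (target, candidates, gold answer) that the record does not specify. All function names, parameters, and default values (`build_ri_model`, `dim`, `nonzero`, `window`, and so on) are hypothetical.

```python
import math
import random
from collections import defaultdict


def index_vector(rng, dim, nonzero):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def build_ri_model(sentences, dim=64, nonzero=4, window=2, seed=0):
    """Random indexing: sum the index vectors of each word's neighbours."""
    rng = random.Random(seed)
    index = {}  # word -> fixed random index vector
    context = defaultdict(lambda: [0.0] * dim)  # word -> context vector
    for tokens in sentences:
        for tok in tokens:
            if tok not in index:
                index[tok] = index_vector(rng, dim, nonzero)
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = context[target]
                for k, value in enumerate(index[tokens[j]]):
                    ctx[k] += value
    return dict(context)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def answer_question(model, target, candidates):
    """Pick the candidate whose vector is most similar to the target's."""
    zero = [0.0] * len(next(iter(model.values())))
    target_vec = model.get(target, zero)
    return max(candidates, key=lambda c: cosine(target_vec, model.get(c, zero)))


def accuracy(model, questions):
    """Fraction of (target, candidates, gold) questions answered correctly."""
    correct = sum(
        answer_question(model, target, candidates) == gold
        for target, candidates, gold in questions
    )
    return correct / len(questions)
```

An LSA model would differ only in how the word vectors are obtained (a truncated SVD of a word-context matrix instead of summed random index vectors). The cosine-based question answering and the accuracy computation would stay the same under these assumptions.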

Named Entities ; Extraction ; Classification

Contribution details

111-116.

2012.

published

Parent publication details

Proceedings of the Eighth Language Technologies Conference

Erjavec, Tomaž ; Žganec Gros, Jerneja

Ljubljana:

1581-9973

Conference details

Information Society 2012 - Eighth Language Technologies Conference

lecture

08.10.2012-09.10.2012

Ljubljana, Slovenia

Related research area

Computing
