Data source: CROSBI

Unsupervised Topic-Oriented Keyphrase Extraction and its Application to Croatian (CROSBI ID 173792)

Journal contribution | original scientific paper | international peer review

Saratlija, Josip ; Šnajder, Jan ; Dalbelo Bašić, Bojana Unsupervised Topic-Oriented Keyphrase Extraction and its Application to Croatian // Lecture notes in computer science, 6836 (2011), 340-347

Responsibility data

Saratlija, Josip ; Šnajder, Jan ; Dalbelo Bašić, Bojana

English

Unsupervised Topic-Oriented Keyphrase Extraction and its Application to Croatian

Labeling documents with keyphrases is a tedious and expensive task. Most approaches to automatic keyphrase extraction rely on supervised learning and require manually labeled training data. In this paper we propose a fully unsupervised keyphrase extraction method that differs from the usual generic keyphrase extractors in the way the keyphrases are formed. Our method begins by building topically related word clusters from which document keywords are selected, and then expands the selected keywords into syntactically valid keyphrases. We evaluate our approach on a Croatian document collection annotated by eight human experts, taking into account the high subjectivity of the keyphrase extraction task. The proposed method reaches up to F1=44.5, which is below the performance of human annotators but comparable to that of a supervised approach.
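The three stages named in the abstract (cluster words, select keywords, expand to keyphrases) can be illustrated with a rough sketch. The abstract does not specify the word features, the number of clusters, the keyword selection rule, or the syntactic patterns used for expansion, so everything below is an assumption made for illustration: words are represented by the sentences they occur in, a small deterministic k-means groups them, the most frequent word of each cluster is taken as its keyword, and "expansion" is approximated by the longest run of adjacent non-stopwords containing that keyword. The function names, the toy English stopword list, and the default `k=2` are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter

# Toy stopword list (assumption; the paper targets Croatian text).
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to", "for", "on", "are", "by"}

def sentences(text):
    """Split text into sentences, each a list of lowercase word tokens."""
    return [re.findall(r"[a-z]+", s.lower())
            for s in re.split(r"[.!?]", text) if s.strip()]

def kmeans(vectors, k, iters=10):
    """Plain k-means; initial centroids are the first k distinct vectors."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    assign = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def expand(keyword, sents):
    """Longest run of adjacent non-stopwords containing the keyword.

    A crude stand-in for expansion into syntactically valid phrases.
    """
    best = [keyword]
    for s in sents:
        run = []
        for tok in s + [""]:  # empty sentinel flushes the final run
            if tok and tok not in STOPWORDS:
                run.append(tok)
            else:
                if keyword in run and len(run) > len(best):
                    best = run
                run = []
    return " ".join(best)

def extract_keyphrases(text, k=2):
    """Cluster words, pick one keyword per cluster, expand each to a phrase."""
    sents = sentences(text)
    freq = Counter(w for s in sents for w in s if w not in STOPWORDS)
    words = sorted(freq)
    if not words:
        return []
    # Feature vector: which sentences the word occurs in (assumption).
    vectors = [[1 if w in s else 0 for s in sents] for w in words]
    assign = kmeans(vectors, k)
    phrases = []
    for c in sorted(set(assign)):
        cluster = [w for w, a in zip(words, assign) if a == c]
        keyword = max(cluster, key=lambda w: (freq[w], w))
        phrases.append(expand(keyword, sents))
    return phrases
```

Calling `extract_keyphrases(document_text, k=5)` returns at most five phrases, one per non-empty cluster. A realistic version for Croatian would presumably need lemmatization and part-of-speech patterns, which this sketch leaves out.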

Information extraction; keyphrase extraction; unsupervised learning; k-means; Croatian language

not recorded

not recorded

not recorded

not recorded

not recorded

not recorded

Publication data

6836

2011

340-347

published

0302-9743

Related fields

Computing

Indexing