
Bibliographic record number: 775581

Conference proceedings

Authors: Bago, Petra; Ljubešić, Nikola
Title: Using machine learning for language and structure annotation in an 18th century dictionary
Source: Proceedings of the Electronic lexicography in the 21st century 2015 conference / Kosem, Iztok; Jakubíček, Miloš; Kallas, Jelena; Krek, Simon (eds.). - Ljubljana/Brighton: Trojina, Institute for Applied Slovene Studies/Lexical Computing Ltd., 2015. 427-442 (ISBN: 978-961-93594-3-3).
Conference: Electronic lexicography in the 21st century: linking lexical data in the digital age
Place and date: Herstmonceux Castle, United Kingdom of Great Britain and Northern Ireland, 11-13 August 2015
Keywords: historical dictionaries; language annotation; structure annotation; supervised machine learning
Abstract:
The accessibility of digitized historical texts is increasing, which has resulted in a growing interest in applying machine learning methods to enrich this type of content. The need for machine learning is even greater than for modern texts, given the high level of inconsistency in historical texts, even within the same document. In this paper we investigate the application of a supervised structural machine learning method to language and structure annotation of 18th century dictionary entries. Our research is conducted on the first volume of the trilingual dictionary ‘Dizionario italiano–latino–illirico’ (Italian–Latin–Croatian Dictionary) compiled by Ardellio della Bella and printed in Dubrovnik in 1785. We assume that by using this method we can significantly reduce the time needed for manual annotation and simplify the process for the annotators. We reach an accuracy of approximately 98% for language annotation and around 96% for structure annotation. A final experiment on the time gain obtained by pre-annotating the data shows that only correcting the generated labels is roughly five times faster than full manual annotation.
Type of participation: Lecture
Type of presentation in proceedings: Full paper (more than 1500 words)
Type of peer review: International peer review
Project / topic: 130-1301679-1380
Original language: eng
Category: Scientific
Research fields:
Information and communication sciences
URL: https://elex.link/elex2015/conference-proceedings/paper-28/
Entered in CROSBI by: Petra Bago (pbago@ffzg.hr), 11 Sep 2015 at 11:27


