Data source: CROSBI

Exploring String and Word Kernels on Croatian-English Parallel Corpus (CROSBI ID 548329)

Conference paper in proceedings | original scientific paper | international peer review

Jonke, Zeno ; Šilić, Artur ; Dalbelo Bašić, Bojana Exploring String and Word Kernels on Croatian-English Parallel Corpus // Intelligent Systems MIPRO 2009. Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO, 2009. pp. 308-311

Statement of responsibility

Jonke, Zeno ; Šilić, Artur ; Dalbelo Bašić, Bojana

English

Exploring String and Word Kernels on Croatian-English Parallel Corpus

In this paper we investigate the classification performance of kernel-based document representations, as well as the influence of kernel parameters on text classification in two morphologically different languages. We explore and compare two kernel functions that work at different levels of a sentence. The first is the gap-weighted kernel, a member of the string kernel family that operates at the character level and thus compares text documents by subsequences of characters. This removes the need for stemming or lemmatisation, since it captures word stems automatically, which is very important when tools for stemming or lemmatisation are not available. The second is the word sequence kernel, an extension of string kernels that works at the word level. This approach provides a more natural representation of the text and has the advantage of reducing the size of the document representation, which in turn reduces computation time. The two methods are compared by exploring their parameter dependency and by measuring their classification performance on the Croatian-English parallel corpus.
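
To make the two kernels in the abstract concrete, the following is a minimal sketch of a gap-weighted subsequence kernel computed with the standard dynamic-programming recurrence. It is not the authors' implementation. The function names `ssk` and `normalized_ssk`, the parameter names `n` (subsequence length) and `lam` (gap decay factor), and the example inputs are all assumptions chosen for illustration. The same function is applied to a string (character level) and to a list of tokens (word level), which is one simple way to obtain a word sequence kernel.

```python
import math


def ssk(s, t, n, lam):
    """Gap-weighted subsequence kernel of order n with decay lam (illustrative).

    s and t may be strings (character level) or lists of tokens (word level).
    Each common subsequence of length n contributes lam ** (span in s + span in t).
    """
    ls, lt = len(s), len(t)
    # kp[i][a][b] holds the auxiliary value K'_i for the prefixes s[:a], t[:b]
    kp = [[[0.0] * (lt + 1) for _ in range(ls + 1)] for _ in range(n)]
    for a in range(ls + 1):
        for b in range(lt + 1):
            kp[0][a][b] = 1.0
    for i in range(1, n):
        for a in range(i, ls + 1):
            kpp = 0.0  # running K''_i for the current prefix s[:a]
            for b in range(i, lt + 1):
                if s[a - 1] == t[b - 1]:
                    kpp = lam * (kpp + lam * kp[i - 1][a - 1][b - 1])
                else:
                    kpp = lam * kpp
                kp[i][a][b] = lam * kp[i][a - 1][b] + kpp
    # Close every subsequence on a matching final symbol
    total = 0.0
    for a in range(n, ls + 1):
        for b in range(n, lt + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[n - 1][a - 1][b - 1]
    return total


def normalized_ssk(s, t, n, lam):
    """Kernel scaled by the self-similarities, so documents of different length compare."""
    denom = math.sqrt(ssk(s, s, n, lam) * ssk(t, t, n, lam))
    return ssk(s, t, n, lam) / denom if denom > 0 else 0.0


# Character level: "cat" and "car" share only the subsequence "ca"
char_value = ssk("cat", "car", 2, 0.5)

# Word level: the same recurrence over tokens instead of characters
word_value = ssk("the cat sat".split(), "the cat ran".split(), 2, 0.5)
```

At the word level the sequences are much shorter than at the character level, which illustrates why the abstract reports a smaller representation and lower computation time. A matrix of such kernel values could then be passed to any kernel classifier that accepts precomputed kernels.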

word kernels; string kernels; text classification

Contribution details

308-311.

2009.

published

Host publication details

Intelligent Systems MIPRO 2009

Rijeka: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO

Conference details

International Conference MIPRO 2009

lecture

25.05.2009-29.05.2009

Opatija, Croatia

Related fields

Computing