Mining textual data in Croatian (CROSBI ID 507514)
Conference paper in proceedings | original scientific paper | international peer review
Responsibility details
Dalbelo Bašić, Bojana ; Bereček, Boris ; Cvitaš, Ana
English
Mining textual data in Croatian
Textual data is a very useful source of information for business intelligence systems. Text processing algorithms and systems for English and other major world languages are well developed, which is not the case for the Croatian language. This paper explores the applicability of existing systems and examines optimal parameters for Croatian. The quality of input data strongly influences clustering and classification results: experiments yield significantly better results after noise is reduced. The impact of the size and dimensionality of the input learning set is also considered. Special preprocessing for the Croatian language consists of morphological normalisation, a useful step towards better results. Specialised text mining tools not developed for Croatian are also applicable.
text mining; text classification; clustering; morphological normalisation
Contribution details
61-66-x.
2005.
published
Host publication details
Proceedings of the XXVIII International Conference MIPRO 2005, Business Intelligence Systems
Baranović, Mirta ; Sandri, Roberto ; Čišić, Dragan ; Hutinski, Željko
Opatija: Hrvatska udruga za informacijsku i komunikacijsku tehnologiju, elektroniku i mikroelektroniku - MIPRO
Conference details
Business Intelligence Systems - MIPRO 2005
lecture
30.05.2005-03.06.2005
Opatija, Croatia