Data source: CROSBI

Developing a text classifier with constrained development and execution time (CROSBI ID 613314)

Conference contribution in proceedings | original scientific paper | international peer review

Budiselić, Ivan ; Delač, Goran ; Vladimir, Klemo Developing a text classifier with constrained development and execution time // Proceedings of the 37th International Convention on Information and Communication Technology, Electronics and Microelectronics. Opatija, 2014. pp. 1170-1175

Authorship details

Budiselić, Ivan ; Delač, Goran ; Vladimir, Klemo

English

Developing a text classifier with constrained development and execution time

The aim of this paper is to show that an accurate and efficient text classifier for relatively simple problem domains can be created in only a few hours of development time. The motivating example discussed in the paper is a recent HackerRank competition problem that tasked competitors with creating a classifier for questions from the popular question and answer platform StackExchange. The paper describes the key components of one solution to this problem and gives a brief overview of the naive Bayes classifier on which the solution is based. The discussion focuses on feature selection and example representation, which were the key challenges addressed during the development of this classifier. We also analyze the effect of the number of features on accuracy, on training and classification time, and on the size of the resulting classifier and of the training example representation, all of which were important characteristics for the competition. The described classifier achieved slightly over 89% accuracy on the hidden question set, while the winning submission achieved around 92%.
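As a rough illustration of the kind of classifier the abstract refers to, the sketch below trains a multinomial naive Bayes model on a restricted vocabulary. It is not the authors' implementation: the tokenizer, the frequency-based feature selection, the add-one smoothing, the function names, and the toy questions are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(examples, num_features):
    """Train a multinomial naive Bayes model on (text, label) pairs."""
    # Assumed feature selection: keep only the most frequent tokens.
    totals = Counter()
    for text, _ in examples:
        totals.update(tokenize(text))
    vocab = {tok for tok, _ in totals.most_common(num_features)}

    label_counts = Counter()             # number of examples per label
    token_counts = defaultdict(Counter)  # per-label counts of kept tokens
    for text, label in examples:
        label_counts[label] += 1
        token_counts[label].update(t for t in tokenize(text) if t in vocab)
    return vocab, label_counts, token_counts


def classify(model, text):
    """Return the label with the highest log-posterior score."""
    vocab, label_counts, token_counts = model
    n_examples = sum(label_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / n_examples)  # log prior
        # Add-one (Laplace) smoothing over the selected vocabulary.
        denom = sum(token_counts[label].values()) + len(vocab)
        for t in tokens:
            score += math.log((token_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, standing in for StackExchange questions.
examples = [
    ("How do I integrate a polynomial function", "math"),
    ("Proof that the sum of two even numbers is even", "math"),
    ("How to install a package on Ubuntu with apt", "linux"),
    ("Ubuntu kernel update breaks the boot loader", "linux"),
]
model = train(examples, num_features=50)
print(classify(model, "integrate this polynomial"))
print(classify(model, "apt install fails on Ubuntu"))
```

In this sketch `num_features` is the knob that trades accuracy against model size and classification time: a smaller vocabulary means fewer stored counts and fewer terms per score, at the cost of discarding potentially informative tokens.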

text classification; development time constraints

Contribution details

1170-1175.

2014.

published

Host publication details

Proceedings of the 37th International Convention on Information and Communication Technology, Electronics and Microelectronics

Opatija:

Conference details

37th International Convention on Information and Communication Technology, Electronics and Microelectronics

lecture

26.05.2014-30.05.2014

Opatija, Croatia

Related research fields

Computing