Outlier detection methods appropriateness in detection of speeders in business surveys (CROSBI ID 639665)
Conference contribution in proceedings | abstract of a conference presentation | international peer review
Responsibility details
Žmuk, Berislav
English
Outlier detection methods appropriateness in detection of speeders in business surveys
In survey methodology, speeding refers to answering survey questions unusually fast. Speeders, that is, respondents who speed, characteristically give answers without engaging in the cognitive process that the questions require. This low respondent engagement leads to poor data quality and validity, so detecting and omitting such respondents is crucial for improving data quality. Awareness of the speeding problem emerged with the rising popularity of web surveys. With other data collection methods, measuring the time respondents need to complete a questionnaire is complicated and expensive, whereas computer technology makes it easy to collect various data about respondents. Such data, called paradata, range from the time needed to answer each question in the questionnaire to information about the respondent's location and the device used to answer the questions. The research question is how to detect the presence of speeders in a survey. It is assumed that this is possible using different statistical outlier detection methods. The following graphical methods for outlier detection are used in the paper: dot plot, scatter diagram, histogram, and box plot. The quantitative methods for outlier detection applied in the paper are: z-score, modified z-score, Dixon's test, Grubbs' test, the Tietjen-Moore test, and Rosner's (generalized extreme studentized deviate, ESD) test. These outlier detection methods were applied to the times needed to complete a Croatian business survey, which was conducted in 2013 as a web survey. The analysis covered survey completion times for 217 enterprises that use statistical methods in their business. In addition to analysing all enterprises together, separate analyses were conducted for small, medium and large enterprises.
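As a rough illustration of two of the quantitative methods named above, the following sketch computes z-scores and modified z-scores for a list of completion times. The data are invented for illustration and are not taken from the survey. The cut-offs of 3 and 3.5 are common conventions and are an assumption here, since the abstract does not state which thresholds were used. The modified z-score follows the usual median and median absolute deviation (MAD) form with the constant 0.6745.

```python
import statistics


def z_scores(values):
    """Classical z-score: distance from the mean in sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def modified_z_scores(values):
    """Modified z-score based on the median and the MAD.

    Assumes the MAD is non-zero, i.e. fewer than half of the values
    coincide with the median.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [0.6745 * (v - med) / mad for v in values]


# Invented completion times in seconds (hypothetical, not the survey data).
times = [35, 40, 310, 330, 350, 360, 375, 390, 400, 420, 450, 480, 520, 600, 2400]

# Assumed conventional cut-offs; the abstract does not specify them.
Z_CUTOFF = 3.0
MODIFIED_Z_CUTOFF = 3.5

for t, z, mz in zip(times, z_scores(times), modified_z_scores(times)):
    flags = []
    if abs(z) > Z_CUTOFF:
        flags.append("z-score")
    if abs(mz) > MODIFIED_Z_CUTOFF:
        flags.append("modified z-score")
    print(f"{t:>5} s  z={z:6.2f}  modified z={mz:6.2f}  flagged by: {', '.join(flags) or 'none'}")
```

The sketch only shows the mechanics of the two scores on toy data; it says nothing about how the methods performed on the actual survey times.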
The analysis showed that none of the observed outlier detection methods was able to detect speeders in an appropriate and satisfactory way. The main reasons are threefold: slowers, who captured the full attention of the outlier detection methods; the violated normality assumption, since the observed methods more or less assume that survey completion times are normally distributed; and masking, whereby the presence of several speeders made them invisible to the outlier detection methods. Future research should therefore improve and adjust existing outlier detection methods so that they are capable of detecting speeders. Introducing entirely new speeder detection methods is also a good option for future research.
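One way to see how slowers can crowd out speeders under a z-score rule is to note that completion times cannot be negative, so the z-score of any observation in a sample is bounded below by -mean/sd. The sketch below, on invented data, computes that bound with and without a single slower. The function name and the cut-off of -3 are assumptions made for illustration, not details from the paper.

```python
import statistics


def lowest_reachable_z(times):
    """Lower bound on the z-score of any non-negative value in the sample.

    Because every time x satisfies x >= 0, (x - mean) / sd >= -mean / sd.
    """
    mean = statistics.fmean(times)
    sd = statistics.stdev(times)
    return -mean / sd


# Invented completion times in seconds (hypothetical, not the survey data).
without_slower = [300, 320, 340, 350, 360, 370, 380, 400, 410, 420]
with_slower = without_slower + [2400]  # one very slow respondent added

Z_CUTOFF = -3.0  # assumed conventional lower cut-off

for label, sample in [("without slower", without_slower), ("with slower", with_slower)]:
    bound = lowest_reachable_z(sample)
    reachable = bound < Z_CUTOFF
    print(f"{label}: lowest reachable z = {bound:.2f}, "
          f"lower cut-off reachable: {reachable}")
```

In this toy sample, the single slower inflates the standard deviation so much that the bound rises above the cut-off, meaning no completion time, however short, could be flagged on the lower side.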
speeding; outlier detection methods; threshold; statistical methods use; web survey
This work has been partially supported by the Croatian Science Foundation under the project STRENGTHS (project no. 9402, project period: 2014-2018).
not recorded
not recorded
not recorded
not recorded
not recorded
Contribution details
95-95.
2016.
published
Host publication details
KOI 2016 Book of Abstracts
Scitovski, Rudolf ; Zekić-Sušac, Marijana
Osijek: Hrvatsko društvo za operacijska istraživanja (CRORS)
1849-5141
Conference details
16th International Conference on Operational Research - KOI 2016
lecture
27.09.2016-29.09.2016
Osijek, Croatia