TermeX: A Tool for Collocation Extraction (CROSBI ID 148260)
Journal contribution | original scientific paper | international peer review
Responsibility information
Delač, Davor ; Krleža, Zoran ; Dalbelo Bašić, Bojana ; Šnajder, Jan ; Šarić, Frane
English
TermeX: A Tool for Collocation Extraction
Collocations – word combinations that occur together more often than by chance – have a wide range of NLP applications. Many approaches to automating collocation extraction based on lexical association measures have been proposed in the literature. This paper presents TermeX – a tool for efficient extraction of collocations based on a variety of association measures. TermeX implements POS filtering and lemmatization, and can extract collocations of up to four words. We address the trade-off between memory consumption and processing speed and propose an efficient implementation. Our implementation allows for processing time linear in corpus size and memory consumption linear in the number of word types.
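To illustrate what an association measure does, here is a minimal Python sketch that scores adjacent word pairs by pointwise mutual information (PMI). PMI is used only as one well-known example of a lexical association measure; this record does not state which measures TermeX implements. The function name `pmi_bigrams` and the `min_count` parameter are hypothetical, and the sketch does not attempt POS filtering, lemmatization, or the memory optimizations described in the abstract.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by PMI (illustrative, not TermeX's code)."""
    # Count single words and adjacent pairs over the token list.
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Skip rare pairs: PMI is unreliable for low counts.
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI = log2( P(x, y) / (P(x) * P(y)) )
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))

    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    text = "new york is big and new york is old and the city is new"
    for pair, score in pmi_bigrams(text.split()):
        print(pair, round(score, 3))
```

A positive score means the pair occurs together more often than its individual word frequencies would predict under independence, which is the intuition behind the definition of a collocation given above.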
TermeX; tool; Collocation Extraction
Publication details
5449
2009
149-157
published
0302-9743
10.1007/978-3-642-00382-0_12