Problemi obilježavanja elemenata iz stranih jezika u okviru standarda TEI (CROSBI ID 178072)
Journal contribution | review article (scientific) | international peer review
Responsibility details
Barbarić, Vuk-Tadija ; Halonja, Antun
Croatian
Problemi obilježavanja elemenata iz stranih jezika u okviru standarda TEI
The paper identifies the basic problems and provides a broad, applicable theoretical and practical framework for recognizing and marking foreign language elements in the Croatian Language Corpus (Hrvatski jezični korpus). Special attention is given to the possibilities of applying the »foreign« element and the global attribute xml:lang within the TEI (»Text Encoding Initiative«) standard. Marking the corpus in this way can help in compiling dictionaries, more precisely monolingual dictionaries, and can also serve many other kinds of research, primarily lexical.
corpus; Croatian Language Corpus; foreign language elements; TEI standard; marking
The Croatian Language Corpus is being compiled under the Croatian Language Repository project of the Institute of Croatian Language and Linguistics. It comprises a selection of texts on various subjects and in various genres of Croatian, drawn from written sources that range from the first period in which the Croatian language standard was more or less definitively formed, i.e. the second half of the 19th century, to contemporary sources. In their paper the authors focus on the problem of recognizing and marking foreign language elements in the texts being prepared for the Croatian Language Corpus, using the XML markup language within the TEI standard. They particularly focus on the possibilities of applying the »foreign« element and the global attribute xml:lang. As the need for establishing unified criteria for the marking of foreign language elements has arisen, guidelines for solving this problem, especially taking into consideration the usefulness of such a corpus for future linguistic research (e.g. the compilation of dictionaries) as well as objective possibilities, i.e. the input/output ratio, have to be de
English
The Problems of Marking of Foreign Language Elements within the TEI Standard
not recorded
corpus; Croatian Language Corpus; foreign language elements; TEI standard; marking
not recorded
Publication details
58
2012.
1-17
published
0449-363X