An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model

Orešković, Marko

Data source: CROSBI

An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model (CROSBI ID 426315)

Thesis | doctoral dissertation

Orešković, Marko. An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model / Čubrilo, Mirko ; Essert, Mario (mentors). Zagreb: Fakultet organizacije i informatike, 2019. doi: 10.13140/RG.2.2.31092.19849

Responsibility data

Authors

Orešković, Marko

Mentors

Čubrilo, Mirko ; Essert, Mario

Basic data in the original language
Basic data in other languages

Language

English

Title

An Online Syntactic and Semantic Framework for Lexical Relations Extraction Using Natural Language Deterministic Model

Abstract

Given the extraordinary growth in online documents, methods for automated extraction of semantic relations became popular, and shortly after, became necessary. This thesis proposes a new deterministic language model, with the associated artifact, which acts as an online Syntactic and Semantic Framework (SSF) for the extraction of morphosyntactic and semantic relations. The model covers all fundamental linguistic fields: Morphology (formation, composition, and word paradigms), Lexicography (storing words and their features in network lexicons), Syntax (the composition of words in meaningful parts: phrases, sentences, and pragmatics), and Semantics (determining the meaning of phrases). To achieve this, a new tagging system with more complex structures was developed. Instead of the commonly used vectored systems, this new tagging system uses tree-like T-structures with hierarchical, grammatical Word of Speech (WOS), and Semantic of Word (SOW) tags. For relations extraction, it was necessary to develop a syntactic (sub)model of language, which ultimately is the foundation for performing semantic analysis. This was achieved by introducing a new 'O-structure', which represents the union of WOS/SOW features from T-structures of words and enables the creation of syntagmatic patterns. Such patterns are a powerful mechanism for the extraction of conceptual structures (e.g., metonymies, similes, or metaphors), breaking sentences into main and subordinate clauses, or detection of a sentence’s main construction parts (subject, predicate, and object). Since all program modules are developed as general and generative entities, SSF can be used for any of the Indo-European languages, although validation and network lexicons have been developed for the Croatian language only.
The SSF has three types of lexicons (morphs/syllables, words, and multi-word expressions), and the main words lexicon is included in the Global Linguistic Linked Open Data (LLOD) Cloud, allowing interoperability with all other world languages. The SSF model and its artifact represent a complete natural language model which can be used to extract the lexical relations from single sentences, paragraphs, and also from large collections of documents.

Keywords

syntax analysis, semantic analysis, lexical relations extraction, new lexicon types, hierarchical tagset structure, linked open data

Note

not recorded

Language

not recorded

Title

not recorded

Abstract

not recorded

Keywords

not recorded

Note

not recorded

Publication data

Number of pages

237

Defense date

15.03.2019.

Publication status

defended

DOI

10.13140/RG.2.2.31092.19849

Data on the institution that awarded the academic degree

Institution / Organization

Fakultet organizacije i informatike

Place

Zagreb

Related entities

Related persons

Marko Orešković (author)

Mirko Čubrilo (mentor)

Mario Essert (mentor)

Related institutions

Fakultet organizacije i informatike (016) (author's institution)

Field

Information and communication sciences

Links

doi.org

dx.doi.org

repozitorij.foi.unizg.hr