LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | Automatic Extraction of Semantic Similarity of Words from Raw Technical Texts |
Authors | Thanopoulos Aristomenis (Wire Communications Laboratory, Electrical & Computer Engineering Dept., University of Patras, 265 00 Rion, Patras, Greece, aristom@wcl.ee.upatras.gr); Fakotakis Nikos (Wire Communications Laboratory, Electrical & Computer Engineering Dept., University of Patras, 261 10 Rion, Patras, Greece, fakotaki@wcl.ee.upatras.gr); Kokkinakis George (Wire Communications Laboratory, Electrical & Computer Engineering Dept., University of Patras, 261 10 Rion, Patras, Greece, gkokkin@wcl.ee.upatras.gr) |
Keywords | Corpus Processing, Lexical Semantics, NLP, Word Clustering |
Session | Session WO8 - Acquisition of Semantic Information |
Full Paper | 302.ps, 302.pdf |
Abstract | In this paper we address the problem of extracting semantic similarity relations between lexical entities based on context similarities as they appear in specialized text corpora. Only general-purpose linguistic tools are used, in order to achieve portability across domains and languages. Lexical context is extended beyond immediate adjacency but is still confined by clause boundaries. Morphological and collocational information is employed to make the most of the contextual data. The extracted semantic similarity relations are transformed into semantic clusters, which are a primitive form of a domain-specific term thesaurus. |