SUMMARY : Session O35-T Terminology & Knowledge Acquisition

 

Title Conceptual Vector Learning - Comparing Bootstrapping from a Thesaurus or Induction by Emergence
Authors M. Lafourcade
Abstract In the context of Word Sense Disambiguation (WSD) and lexical transfer in Machine Translation (MT), the representation of word meanings is a critical issue. The conceptual vector model aims at representing thematic activations for chunks of text and lexical entries, up to whole documents. Roughly speaking, vectors are supposed to encode the ideas associated with words or expressions. In this paper, we first present the conceptual vector model and the notions of semantic distance and contextualization between terms. Then, we present in detail the text analysis process coupled with conceptual vectors, which is used in text classification, thematic analysis and vector learning. The question we focus on is whether a thesaurus is really needed and desirable for bootstrapping the learning. We conducted two experiments, one with and one without a thesaurus, and report some comparative results here. Our contribution is to show that dimensions are distributed more regularly by an emergent procedure. In other words, resources are exploited more efficiently with an emergent procedure than with a thesaurus: the terms (concepts) listed in a thesaurus somehow relate to their importance in the language, but neither to their frequency in usage nor to their power of discrimination or representativeness.
Keywords Semantic Analysis, Conceptual Vectors, Concept Set, Learning by Emergence
Full paper Conceptual Vector Learning - Comparing Bootstrapping from a Thesaurus or Induction by Emergence