Title |
Building domain specific lexical hierarchies from corpora |
Authors |
Olivier Ferret (CEA - LIST, BP 6, 18, route du Panorama, 92265 Fontenay-aux-Roses Cedex); Christian Fluhr (CEA - LIST, BP 6, 18, route du Panorama, 92265 Fontenay-aux-Roses Cedex); Françoise Rousseau-Hans (CEA - DTI Saclay, 91191 Gif-sur-Yvette Cedex); Jean-Luc Simoni (CEA - LIST, BP 6, 18, route du Panorama, 92265 Fontenay-aux-Roses Cedex) |
Session |
TP1: Terminology |
Abstract |
In this article, we present a new algorithm for building domain-specific lexical hierarchies from texts. The basic elements of such a hierarchy are the normalized terms (single- and multi-word) extracted from a large corpus by a terminological extractor. The algorithm relies on collocations to represent the meaning of these terms, to find hierarchical relations between them and, finally, to organize them into a hierarchy. It also takes the polysemy of terms into account while building the hierarchy. We also present the results of applying it to part of the corpus designed for the ARC A3 of the Francil network and discuss its possible applications. |
Keywords |
Acquisition of lexical resources, Acquisition of semantic resources, Semantic lexicons, Thesaurus building, Acquisition of semantic relations |
Full Paper |