Title | Publicly Available Topic Signatures for all WordNet Nominal Senses |
Author(s) |
Eneko Agirre, Oier Lopez de Lacalle (IXA NLP Group, University of the Basque Country) |
Session | P10-W |
Abstract | Topic signatures are context vectors built for word senses and concepts. They can be automatically acquired from the web for any concept hierarchy using the "monosemous relative" method. Topic signatures have been shown to be useful for Word Sense Disambiguation, for modeling similarity between word senses, for classifying new terms in hierarchies, and for building hierarchical clusters of the senses of a given word. In this work we present a publicly available resource comprising both automatically extracted examples for all WordNet 1.6 noun senses and the topic signatures built from those examples. We gathered around 700 sentences for each noun in WordNet. When the monosemous relatives are used to build a sense corpus for polysemous words, the corpus comprises an average of around 3,500 sentences per word sense. The resulting topic signatures contain around 4,500 words per word sense. |
Keyword(s) | Topic signatures, WordNet, distributional semantics |
Language(s) | English |
Full Paper | 753.pdf |