Title |
Study of Word Sense Disambiguation System that uses Contextual Features - Approach of Combining Associative Concept Dictionary and Corpus - |
Authors |
Kyota Tsutsumida, Jun Okamoto, Shun Ishizaki, Makoto Nakatsuji, Akimichi Tanaka and Tadasu Uchiyama |
Abstract |
We propose a Word Sense Disambiguation (WSD) method that accurately maps ambiguous words to concepts in the Associative Concept Dictionary (ACD), even when the test corpus and the WSD training corpus come from different domains. Many WSD studies determine the context of an ambiguous target word by analyzing the sentences that contain it. However, such methods perform poorly when applied to a corpus that differs from the training corpus. One solution is to use the associated words assigned to each concept in the ACD, which are domain-independent: they are the words that many people commonly associate with a given concept. Furthermore, by using the associated words of a concept as search queries against a training corpus, our method extracts relevant words, i.e. words that are computationally judged to be related to that concept. By counting how often associated words and relevant words appear near the target word in a sentence of the test corpus, our method assigns the target word to a concept in the ACD. Our evaluation on two different types of corpora demonstrates good accuracy. |
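The frequency check described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function name `classify_sense`, the fixed token window, the unweighted count used as a score, and the invented English toy vocabulary. The abstract does not specify how "near" is defined or how associated and relevant words are weighted.

```python
from collections import Counter


def classify_sense(tokens, target_index, concept_words, window=5):
    """Pick the concept whose cue words occur most often near the target.

    tokens        : tokenized sentence from the test corpus
    target_index  : position of the ambiguous word in `tokens`
    concept_words : concept -> set of associated + relevant words
                    (hypothetical stand-in for ACD entries plus
                    corpus-derived relevant words)
    window        : assumed number of tokens inspected on each side
    """
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    # Context = tokens inside the window, excluding the target itself.
    context = Counter(tokens[lo:target_index] + tokens[target_index + 1:hi])
    # Score = total occurrences of a concept's cue words in the context.
    scores = {
        concept: sum(context[word] for word in words)
        for concept, words in concept_words.items()
    }
    # Ties resolve to the first concept in dictionary order.
    best = max(scores, key=scores.get)
    return best, scores


# Invented toy data, for illustration only.
concept_words = {
    "bank/finance": {"money", "loan", "account", "deposit"},
    "bank/river": {"river", "water", "shore", "fishing"},
}
tokens = "she opened an account at the bank to deposit money".split()
best, scores = classify_sense(tokens, tokens.index("bank"), concept_words)
print(best, scores)
```

In this sketch the associated words and the relevant words are merged into one set per concept; a closer approximation of the method might keep them separate and weight them differently, but the abstract gives no detail on that point.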
Topics |
Word Sense Disambiguation, Lexicon, lexical database, Document Classification, Text categorisation |
Bibtex |
@InProceedings{TSUTSUMIDA10.192,
  author = {Kyota Tsutsumida and Jun Okamoto and Shun Ishizaki and Makoto Nakatsuji and Akimichi Tanaka and Tadasu Uchiyama},
  title = {Study of Word Sense Disambiguation System that uses Contextual Features - Approach of Combining Associative Concept Dictionary and Corpus -},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |