Title |
Evaluating the Impact of Some Linguistic Information on the Performances of a Similarity-based and Translation-oriented Word-Sense Disambiguation Method |
Authors |
Myriam Rakho and Matthieu Constant |
Abstract |
In this article, we present an experiment of linguistic parameter tuning in the representation of the semantic space of polysemous words. We evaluate quantitatively the influence of some basic linguistic knowledge (lemmas, multi-word expressions, grammatical tags and syntactic relations) on the performances of a similarity-based Word-Sense disambiguation method. The question we try to answer, by this experiment, is which kinds of linguistic knowledge are most useful for the semantic disambiguation of polysemous words, in a multilingual framework. The experiment is about 20 French polysemous words (16 nouns and 4 verbs) and we make use of the French-English part of the sentence-aligned EuroParl Corpus for training and testing. Our results show a strong correlation between the system accuracy and the degree of precision of the linguistic features used, particularly the syntactic dependency relations. Furthermore, the lemma-based approach absolutely outperforms the word form-based approach. The best accuracy achieved by our system amounts to 90%. |
Topics |
Word Sense Disambiguation, Statistical and machine learning methods, MultiWord Expressions & Collocations |
Full paper |
Evaluating the Impact of Some Linguistic Information on the Performances of a Similarity-based and Translation-oriented Word-Sense Disambiguation Method |
Slides |
- |
Bibtex |
@InProceedings{RAKHO10.687,
author = {Myriam Rakho and Matthieu Constant}, title = {Evaluating the Impact of Some Linguistic Information on the Performances of a Similarity-based and Translation-oriented Word-Sense Disambiguation Method}, booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} } |