Title |
Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering |
Authors |
Hiroyuki Shinnou and Minoru Sasaki |
Abstract |
In this paper, we describe a system that divides example sentences (the data set) into clusters based on the meaning of the target word, using a semi-supervised clustering technique. In this task, estimating the number of clusters (that is, the number of meanings) is critical, and our system concentrates primarily on this aspect. First, a user gives the system an initial cluster number for the target word. The system then performs general clustering on the data set to obtain small clusters. Next, using constraints given by the user, the system integrates these clusters to obtain the final clustering result. Our system performs this entire procedure with high precision while requiring only a few constraints. In the experiment, we tested the system on 12 Japanese nouns used in the SENSEVAL2 Japanese dictionary task. The experiment showed the effectiveness of our system. In the future, we will improve the sentence similarity measurements. |
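The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the small clusters are already available as lists of sentence indices, and it assumes the user constraints take the form of must-link pairs (two sentences that share a meaning). The function name `merge_clusters` and the sample data are hypothetical.

```python
def merge_clusters(small_clusters, must_link):
    """Merge small clusters connected by must-link pairs (union-find).

    Assumption: constraints are must-link pairs of sentence indices;
    the abstract does not specify the constraint format.
    """
    # Map each sentence index to the small cluster that contains it.
    owner = {item: idx
             for idx, cluster in enumerate(small_clusters)
             for item in cluster}
    parent = list(range(len(small_clusters)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_link:
        ra, rb = find(owner[a]), find(owner[b])
        if ra != rb:
            # Keep the smaller index as the root so output order is stable.
            parent[max(ra, rb)] = min(ra, rb)

    # Collect the sentences of every small cluster under its root.
    merged = {}
    for idx, cluster in enumerate(small_clusters):
        merged.setdefault(find(idx), []).extend(cluster)
    return [sorted(items) for _, items in sorted(merged.items())]


# Hypothetical data: four small clusters over six sentences, two constraints.
small = [[0, 1], [2], [3, 4], [5]]
constraints = [(1, 2), (4, 5)]
final = merge_clusters(small, constraints)
print(final)
```

In this sketch the number of final clusters is not fixed in advance; it falls out of how many small clusters the constraints connect.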
Language |
Single language |
Topics |
Corpus (creation, annotation, etc.), Word Sense Disambiguation, Tools, systems, applications |
Full paper |
Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering |
Slides |
- |
Bibtex |
@InProceedings{SHINNOU08.301,
author = {Hiroyuki Shinnou and Minoru Sasaki},
title = {Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = may,
date = {28-30},
address = {Marrakech, Morocco},
editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |