Title |
Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues |
Authors |
Christian Raymond, Kepa Joseba Rodriguez and Giuseppe Riccardi |
Abstract |
In this paper we present an active approach to annotating an Italian corpus of conversational human-human and Wizard-of-Oz dialogues with lexical and semantic labels. The procedure uses a machine learner to assist human annotators in the labeling task. The computer-assisted process engages human annotators to check and correct the automatic annotation rather than starting from unannotated data. The active learning procedure is combined with annotation error detection to control the reliability of the annotation. With the goal of converging as fast as possible on reliable automatic annotations while minimizing human effort, we follow the active learning paradigm, which selects for annotation the most informative training examples required to achieve a better level of performance. We show that this procedure allows quick convergence on correct annotations and thus minimizes the cost of human supervision. |
Language |
Single language |
Topics |
Corpus (creation, annotation, etc.), Speech recognition and understanding, Acquisition, Machine Learning |
Full paper |
Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues |
Slides |
- |
Bibtex |
@InProceedings{RAYMOND08.499,
author = {Christian Raymond and Kepa Joseba Rodriguez and Giuseppe Riccardi},
title = {Active Annotation in the LUNA Italian Corpus of Spontaneous Dialogues},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = {may},
date = {28-30},
address = {Marrakech, Morocco},
editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |