Title |
Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents |
Authors |
Lina M. Rojas-Barahona, Alejandra Lorenzo and Claire Gardent |
Abstract |
We describe the acquisition of a dialog corpus for French based on multi-task human-machine interactions in a serious game setting. We present a data collection tool that is configurable for multiple games; describe the data collected with this tool and the schema used to annotate it; and report the results obtained when training a classifier on the annotated data to associate each player turn with a dialog move usable by a rule-based dialog manager. The collected data consists of approximately 1250 dialogs, 10454 utterances and 168509 words and will be made freely available for academic and nonprofit research. |
Topics |
Corpus (creation, annotation, etc.), Dialogue, Tools, systems, applications |
Full paper |
Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents |
Bibtex |
@InProceedings{ROJASBARAHONA12.505,
  author = {Lina M. Rojas-Barahona and Alejandra Lorenzo and Claire Gardent},
  title = {Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
} |