Summary of the paper

Title DECODA: a call-centre human-human spoken conversation corpus
Authors Frederic Bechet, Benjamin Maza, Nicolas Bigouroux, Thierry Bazillon, Marc El-Beze, Renato De Mori and Eric Arbillot
Abstract The goal of the DECODA project is to reduce the development cost of Speech Analytics systems by reducing the need for manual annotat ion. This project aims to propose robust speech data mining tools in the framework of call-center monitoring and evaluation, by means of weakl y supervised methods. The applicative framework of the project is the call-center of the RATP (Paris public transport authority). This project tackles two very important open issues in the development of speech mining methods from spontaneous speech recorded in call-centers : robus tness (how to extract relevant information from very noisy and spontaneous speech messages) and weak supervision (how to reduce the annotation effort needed to train and adapt recognition and classification models). This paper describes the DECODA corpus collected at the RATP during the project. We present the different annotation levels performed on the corpus, the methods used to obtain them, as well as some evaluation o f the quality of the annotations produced.
Topics Corpus (creation, annotation, etc.)
Full paper DECODA: a call-centre human-human spoken conversation corpus
Bibtex @InProceedings{BECHET12.684,
  author = {Frederic Bechet and Benjamin Maza and Nicolas Bigouroux and Thierry Bazillon and Marc El-Beze and Renato De Mori and Eric Arbillot},
  title = {DECODA: a call-centre human-human spoken conversation corpus},
  booktitle = {Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
 }
Powered by ELDA © 2012 ELDA/ELRA