Summary of the paper

Title The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output
Authors Daniele Pighin, Lluís Màrquez and Lluís Formiga
Abstract We present a corpus consisting of 11,292 real-world English-to-Spanish automatic translations annotated with relative (ranking) and absolute (adequate/non-adequate) quality assessments. The translation requests, collected through the popular translation portal http://reverso.net, provide a highly varied sample of real-world machine translation (MT) usage, from complete sentences to units of one or two words, from well-formed to hardly intelligible texts, from technical documents to colloquial and slang snippets. In this paper, we present 1) a preliminary annotation experiment that we carried out to select the most appropriate quality criterion to be used for these data, 2) a graph-based methodology inspired by Interactive Genetic Algorithms to reduce the annotation effort, and 3) the outcomes of the full-scale annotation experiment, which result in a valuable and original resource for the analysis and characterization of MT-output quality.
Topics Corpus (creation, annotation, etc.), Machine Translation, SpeechToSpeech Translation, Semantics
Bibtex @InProceedings{PIGHIN12.370,
  author = {Daniele Pighin and Lluís Màrquez and Lluís Formiga},
  title = {The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
 }