Title |
A Dataset for Assessing Machine Translation Evaluation Metrics |
Authors |
Lucia Specia, Nicola Cancedda and Marc Dymetman |
Abstract |
We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can be used in a range of tasks for assessing machine translation evaluation metrics, from basic correlation analysis to the training and testing of machine learning-based metrics. By providing a standard dataset for such tasks, we hope to encourage the development of better MT evaluation metrics. |
Topics |
Corpus (creation, annotation, etc.), Machine Translation, Speech-to-Speech Translation, Statistical and machine learning methods |
Full paper |
A Dataset for Assessing Machine Translation Evaluation Metrics |
Slides |
- |
Bibtex |
@InProceedings{SPECIA10.504,
  author = {Lucia Specia and Nicola Cancedda and Marc Dymetman},
  title = {A Dataset for Assessing Machine Translation Evaluation Metrics},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
}