Title |
Automatic Evaluation Measures for Statistical Machine Translation System Optimization |
Authors |
Arne Mauser, Sasa Hasan and Hermann Ney |
Abstract |
Evaluation of machine translation (MT) output is a challenging task. In most cases, there is no single correct translation. In the extreme case, two translations of the same input can have completely different words and sentence structure while still both being perfectly valid. Large projects and competitions for MT research have raised the need for reliable and efficient evaluation of MT systems. On the funding side, the obvious motivation is to measure the performance and progress of research. This often results in a specific measure or metric being taken as the primary evaluation criterion. Do improvements in one measure really lead to improved MT performance? How does a gain in one evaluation metric affect other measures? This paper addresses these questions through a number of experiments. |
Language |
|
Topics |
Machine Translation, SpeechToSpeech Translation, Evaluation methodologies, Statistical methods |
Full paper |
Automatic Evaluation Measures for Statistical Machine Translation System Optimization |
Slides |
- |
Bibtex |
@InProceedings{MAUSER08.785,
author = {Arne Mauser and Sasa Hasan and Hermann Ney},
title = {Automatic Evaluation Measures for Statistical Machine Translation System Optimization},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = may,
date = {28-30},
address = {Marrakech, Morocco},
editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |