Title |
Machine Translation for Subtitling: A Large-Scale Evaluation |
Authors |
Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard Van Loenhout, Arantza Del Pozo, Mirjam Sepesy Maucec, Anja Turner and Martin Volk |
Abstract |
This article describes a large-scale evaluation of the use of Statistical Machine Translation for professional subtitling. The work was carried out within the FP7 EU-funded project SUMAT and involved two rounds of evaluation: a quality evaluation and a measure of productivity gain/loss. We present the SMT systems built for the project and the corpora they were trained on, which combine professionally created and crowd-sourced data. Evaluation goals, methodology and results are presented for the eleven translation pairs that were evaluated by professional subtitlers. Overall, a majority of the machine translated subtitles received good quality ratings. The results were also positive in terms of productivity, with a global gain approaching 40%. We also evaluated the impact of applying quality estimation and filtering of poor MT output, which resulted in higher productivity gains for filtered files as opposed to fully machine-translated files. Finally, we present and discuss feedback from the subtitlers who participated in the evaluation, a key aspect for any eventual adoption of machine translation technology in professional subtitling. |
Topics |
Usability, User Satisfaction, Statistical and Machine Learning Methods |
Full paper |
Machine Translation for Subtitling: A Large-Scale Evaluation |
Bibtex |
@InProceedings{ETCHEGOYHEN14.463,
author = {Thierry Etchegoyhen and Lindsay Bywood and Mark Fishel and Panayota Georgakopoulou and Jie Jiang and Gerard Van Loenhout and Arantza Del Pozo and Mirjam Sepesy Maucec and Anja Turner and Martin Volk}, title = {Machine Translation for Subtitling: A Large-Scale Evaluation}, booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)}, year = {2014}, month = {may}, date = {26-31}, address = {Reykjavik, Iceland}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, isbn = {978-2-9517408-8-4}, language = {english} } |