Title | Calibrating Resource-light Automatic MT Evaluation: A Cheap Approach to Ranking MT Systems by the Usability of their Output |
Author(s) | Bogdan Babych (1), Debbie Elliott (2), Anthony Hartley (3) |
Affiliation(s) | (1) Centre for Translation Studies, University of Leeds, Leeds LS2 9JT, UK, bogdan@comp.leeds.ac.uk; (2) School of Computing, University of Leeds, Leeds LS2 9JT, UK, debe@comp.leeds.ac.uk; (3) Centre for Translation Studies, University of Leeds, Leeds LS2 9JT, UK, a.hartley@leeds.ac.uk |
Session | P25-EW |
Abstract | MT systems are traditionally evaluated against criteria such as adequacy and fluency, and automatic evaluation metrics are designed to correlate with these quality parameters. In this paper we introduce a novel parameter, the usability (or utility) of the output, which we found to integrate both fluency and adequacy. We applied two automated metrics, BLEU and LTV, to new data for which human evaluation scores were also produced, and then measured the agreement between the automated and human scores. The resources produced in the experiment are available on the authors' website. |
Keyword(s) | machine translation, evaluation, usability, automated methods |
Language(s) | English, French |
Full Paper | 678.pdf |