SUMMARY: Session O25-WE Machine Translation and Evaluation
Title | Training a Statistical Machine Translation System without GIZA++ |
---|---|
Authors | A. Mauser, E. Matusov, H. Ney |
Abstract | The IBM Models (Brown et al., 1993) enjoy great popularity in the machine translation community because they produce high-quality word alignments and a free implementation is available in the GIZA++ toolkit (Och and Ney, 2003). Several methods have been developed to overcome the asymmetry of the alignments generated by the IBM Models. A remaining disadvantage, however, is their high model complexity. This paper describes a word alignment training procedure for statistical machine translation that uses a simple and clear statistical model, different from the IBM Models. The main idea of the algorithm is to generate a symmetric and monotonic alignment between the target sentence and a permutation graph representing different reorderings of the words in the source sentence. The quality of the generated alignment is shown to be comparable to that of standard GIZA++ training in an SMT setup. |
Keywords | |
Full paper | Training a Statistical Machine Translation System without GIZA++ |