Title |
Statistical Machine Translation on Paraphrased Corpora |
Authors |
Taro Watanabe (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan); Mitsuo Shimohata (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan); Eiichiro Sumita (ATR Spoken Language Translation Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0288, Japan) |
Session |
WO20: Machine Translation |
Abstract |
This paper presents a statistical machine translation system trained on normalized corpora. Automatic paraphrasing is carried out by inducing paraphrasing expressions from a bilingual corpus. Normalization is then treated as a specific paraphrase of a given input, determined by its frequency in a corpus. Experimental results on Japanese-to-English translation with a normalized English corpus showed a reduction in word error rate of 8% and an improvement in subjective evaluation from 70% to 72.5%. |
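The normalization idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not confirm: paraphrases are induced here by grouping English sentences that share an identical Japanese source sentence, and the normal form of a group is taken to be its most frequent member. The function names and the toy bitext are hypothetical, and the paper's actual induction procedure may differ.

```python
from collections import Counter, defaultdict


def induce_paraphrase_sets(bitext):
    """Group target sentences that share an identical source sentence.

    Assumption for this sketch: two target sentences are paraphrases
    if they appear as translations of the same source sentence.
    """
    groups = defaultdict(set)
    for src, tgt in bitext:
        groups[src].add(tgt)
    # Only groups with more than one distinct member carry paraphrases.
    return [g for g in groups.values() if len(g) > 1]


def build_normalizer(bitext):
    """Map each paraphrase to the most frequent member of its group."""
    freq = Counter(tgt for _, tgt in bitext)
    normal_form = {}
    for group in induce_paraphrase_sets(bitext):
        # Highest corpus frequency wins; ties are broken alphabetically.
        best = min(group, key=lambda s: (-freq[s], s))
        for sentence in group:
            # If a sentence belongs to several groups, the last one wins.
            normal_form[sentence] = best
    return normal_form


def normalize(sentence, normal_form):
    """Return the normal form, or the sentence itself if none is known."""
    return normal_form.get(sentence, sentence)


# Hypothetical toy bitext of (Japanese, English) pairs.
bitext = [
    ("arigatou", "thank you"),
    ("arigatou", "thanks"),
    ("doumo", "thank you"),
    ("sayonara", "goodbye"),
]
normal_form = build_normalizer(bitext)
```

With this toy data, `normalize("thanks", normal_form)` maps to the more frequent "thank you", while a sentence with no known paraphrase, such as "goodbye", is returned unchanged. A translation model would then be trained on the normalized English side in place of the original.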
Keywords |
Machine translation |