SUMMARY : Session O25-WE Machine Translation and Evaluation
Title | Automatic Detection and Semi-Automatic Revision of Non-Machine-Translatable Parts of a Sentence |
---|---|
Authors | K. Uchimoto, N. Hayashida, T. Ishida, H. Isahara |
Abstract | We developed a method for automatically distinguishing the machine-translatable and non-machine-translatable parts of a given sentence for a particular machine translation (MT) system. The two kinds of parts can be distinguished by calculating, for each part of the sentence, the similarity between the source-language sentence and its back translation. Parts with low similarity are highly likely to be non-machine-translatable. We showed that the parts of a sentence automatically identified as non-machine-translatable provide useful information for paraphrasing or revising the sentence in the source language to improve the quality of the translation produced by the MT system. We also developed a method of providing knowledge useful for effectively paraphrasing or revising the detected non-machine-translatable parts. Two types of knowledge were extracted from the EDR dictionary: one for transforming a lexical entry into an expression used in its definition, and the other for the reverse paraphrase, which transforms an expression found in a definition into the lexical entry. We found that the information provided by the methods helped improve the machine translatability of the originally input sentences. |
Keywords | machine translation, translation aid, machine translatability, similarity, back translation, non-machine-translatable parts, knowledge extraction, paraphrase |
Full paper | Automatic Detection and Semi-Automatic Revision of Non-Machine-Translatable Parts of a Sentence |
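
The detection idea in the abstract (score each part of the source sentence against the back translation, then flag the low-scoring parts) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the authors' implementation: the similarity is a simple token-overlap ratio, tokens are split on whitespace, the sentence is already segmented into parts, the back translation is supplied as a plain string rather than obtained from an MT system, and the names `part_similarity`, `flag_low_similarity_parts`, and the `threshold` value are hypothetical.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumed, deliberately naive tokenizer)."""
    return text.lower().split()


def part_similarity(part: str, back_translation: str) -> float:
    """Fraction of the part's tokens that also occur in the back translation.

    This overlap ratio is a stand-in similarity measure chosen for the sketch;
    the abstract does not specify which measure the paper uses.
    """
    part_tokens = tokenize(part)
    if not part_tokens:
        return 1.0  # an empty part has nothing that could be lost in translation
    back_tokens = set(tokenize(back_translation))
    matched = sum(1 for token in part_tokens if token in back_tokens)
    return matched / len(part_tokens)


def flag_low_similarity_parts(
    parts: List[str], back_translation: str, threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Return (part, similarity) pairs whose similarity falls below the threshold.

    The threshold of 0.5 is a hypothetical default, not a value from the paper.
    """
    flagged = []
    for part in parts:
        score = part_similarity(part, back_translation)
        if score < threshold:
            flagged.append((part, score))
    return flagged


if __name__ == "__main__":
    # Hypothetical example: the source sentence is pre-segmented into parts,
    # and the back translation is assumed to come from a round trip through MT.
    source_parts = ["the committee", "kicked the can down the road", "again"]
    back_translation = "the committee delayed the decision again"
    for part, score in flag_low_similarity_parts(source_parts, back_translation):
        print(f"candidate non-machine-translatable part: {part!r} ({score:.2f})")
```

In this sketch the idiomatic middle part is flagged because few of its tokens survive the round trip, while the other two parts are kept. A flagged part would then be a candidate for paraphrasing or revision in the source language before retranslation.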