Title |
Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension |
Authors |
Takanori Kusumoto and Tomoyosi Akiba |
Abstract |
Statistical machine translation (SMT) requires a parallel corpus between the source and target languages. Although a pivot-translation approach can be applied to a language pair that does not have a parallel corpus directly between them, it requires both source-pivot and pivot-target parallel corpora. We propose a novel approach to apply SMT to a resource-limited source language that has no parallel corpus but has only a word dictionary for the pivot language. The problems with dictionary-based translations lie in their ambiguity and incompleteness. The proposed method uses a word lattice representation of the pivot-language candidates and word lattice decoding to deal with the ambiguity; the lattice expansion is accomplished by using a pivot-target phrase translation table to compensate for the incompleteness. Our experimental evaluation showed that this approach is promising for applying SMT, even when a source-side parallel corpus is lacking. |
Topics |
Machine Translation, SpeechToSpeech Translation, MultiWord Expressions & Collocations, LR Infrastructures and Architectures |
Full paper |
Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension |
Bibtex |
@InProceedings{KUSUMOTO12.677,
  author = {Takanori Kusumoto and Tomoyosi Akiba},
  title = {Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
} |