LREC 2006 - Proceedings sorted by papers

Title	Creating a Large-Scale Arabic to French Statistical Machine Translation System
Authors	S. Hasan, A. Isbihani, H. Ney
Abstract	In this work, the creation of a large-scale Arabic to French statistical machine translation system is presented. We introduce all necessary steps, from corpus acquisition and data preprocessing to training and optimizing the system and its eventual evaluation. Since no corpora existed previously, we collected large amounts of data from the web. Arabic word segmentation was crucial to reduce the overall number of unknown words. We describe the phrase-based SMT system used for training and generation of the translation hypotheses. Results on the second CESTA evaluation campaign are reported. The setting was in the medical domain. The prototype reaches a favorable BLEU score of 40.8%.
Keywords	Statistical Machine Translation, Corpus acquisition and preprocessing, CESTA evaluation
Full paper	Creating a Large-Scale Arabic to French Statistical Machine Translation System
