Summary of the paper

Title Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation
Authors Wajdi Zaghouani, Nizar Habash, Ossama Obeid, Behrang Mohit, Houda Bouamor and Kemal Oflazer
Abstract We present our guidelines and annotation procedure for creating a human-corrected, post-edited machine translation corpus for Modern Standard Arabic. Our overarching goal is to use the annotated corpus to develop automatic machine translation post-editing systems for Arabic that can help accelerate the human revision of translated texts. Creating any manually annotated corpus usually presents many challenges. To address these challenges, we created comprehensive and simplified annotation guidelines, which were used by a team of five annotators and one lead annotator. To ensure high agreement between the annotators, multiple training sessions were held and inter-annotator agreement was measured regularly to check annotation quality. The resulting corpus of manually post-edited English-to-Arabic translations of articles is the largest to date for this language pair.
Topics Corpus (Creation, Annotation, etc.), Evaluation Methodologies, Tools, Systems, Applications
Bibtex @InProceedings{ZAGHOUANI16.696,
  author = {Wajdi Zaghouani and Nizar Habash and Ossama Obeid and Behrang Mohit and Houda Bouamor and Kemal Oflazer},
  title = {Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }