Summary of the paper

Title TransLiTex: A Parallel Corpus of Translated Literary Texts 
Authors Amel Fraisse and Quoc-Tan Tran
Abstract In this paper, we present our ongoing research work to create a massively parallel corpus of translated literary texts which is useful for applications in computational linguistics, translation studies and cross-linguistic corpus studies. Using a crowdsourcing approach, we identified and collected 29 translations of Mark Twain’s Adventures of Huckleberry Finn published in 23 languages including less-resourced languages. We report on the current status of the corpus, with 5 chapter-aligned translations (English-Dutch, two English-Hungarian, English-Polish and English-Russian). We evaluated the correctness of chapter alignment by computing the percentage of common words between the English version and the translated ones. Results show high percentages that vary between 43% and 64% proving the high correctness of chapter alignment.
Topics Translated Literary Texts, Parallel Corpus
Bibtex @InProceedings{FRAISSE18.11,
  author = {Amel Fraisse and Quoc-Tan Tran},
  title = {TransLiTex: A Parallel Corpus of Translated Literary Texts},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {Erhong Yang and Le Sun},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-29-0},
  language = {english}
  }
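
The abstract evaluates chapter alignment by the percentage of words shared between an English chapter and its translated counterpart. The following is a minimal sketch of one way such a score could be computed. It is not the authors' implementation: the abstract does not say how words are matched across languages, so the sketch assumes both chapters have already been brought into a comparable vocabulary (for example, by translating the target chapter back into English). The function names `tokenize` and `common_word_percentage`, and the choice to normalize by the number of distinct source words, are hypothetical.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def common_word_percentage(source_chapter: str, target_chapter: str) -> float:
    """Percentage of distinct source words that also occur in the target.

    Assumption: both chapters are in a comparable vocabulary, e.g. the
    target chapter was translated back into the source language first.
    """
    source_words = tokenize(source_chapter)
    if not source_words:
        # An empty source chapter shares nothing with any target.
        return 0.0
    target_words = tokenize(target_chapter)
    shared = source_words & target_words
    return 100.0 * len(shared) / len(source_words)


if __name__ == "__main__":
    english = "Tom and Huck went down the river"
    back_translated = "Tom and Huck went along the river"
    score = common_word_percentage(english, back_translated)
    print(f"{score:.1f}% of distinct source words are shared")
```

Under these assumptions, a correctly aligned chapter pair should score noticeably higher than a mismatched pair, which is the kind of signal the reported percentages are meant to capture.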