Summary of the paper

Title: QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages
Authors: Arantxa Otegi, Nora Aranberri, António Branco, Jan Hajic, Martin Popel, Kiril Simov, Eneko Agirre, Petya Osenova, Rita Pereira, João Silva and Steven Neale
Abstract: This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference. The corpora comprise both the well-known Europarl corpus and a domain-specific question-answer troubleshooting corpus on the IT domain. English is common in all parallel corpora, with translations in five languages, namely, Basque, Bulgarian, Czech, Portuguese and Spanish. We describe the annotated corpora and the tools used for annotation, as well as annotation statistics for each language. These new resources are freely available and will help research on semantic processing for machine translation and cross-lingual transfer.
Topics: Corpus (Creation, Annotation, etc.), Word Sense Disambiguation, Named Entity Recognition
Full paper: QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages
BibTeX:
@InProceedings{OTEGI16.1012,
  author = {Arantxa Otegi and Nora Aranberri and António Branco and Jan Hajic and Martin Popel and Kiril Simov and Eneko Agirre and Petya Osenova and Rita Pereira and João Silva and Steven Neale},
  title = {QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
}