Summary of the paper

Title Multilingual Corpora with Coreferential Annotation of Person Entities
Authors Marcos Garcia and Pablo Gamallo
Abstract This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish. They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, demonstrative, relative and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). Some statistics have been computed, showing distributional aspects of coreference both in journalistic and in encyclopedic texts. Furthermore, the paper shows the importance of coreference resolution for a task such as Information Extraction, by evaluating the output of an Open Information Extraction system on the annotated corpora. The corpora are freely distributed in two formats: (i) the SemEval-2010 and (ii) the brat rapid annotation tool, so they can be enlarged and improved collaboratively.
Topics Corpus (Creation, Annotation, etc.), Collaborative Resource Construction
Bibtex @InProceedings{GARCIA14.918,
  author = {Marcos Garcia and Pablo Gamallo},
  title = {Multilingual Corpora with Coreferential Annotation of Person Entities},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
}