Title |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures |
Authors |
Judith Muzerelle, Anaïs Lefeuvre, Emmanuel Schang, Jean-Yves Antoine, Aurore Pelletier, Denis Maurel, Iris Eshkol and Jeanne Villaneau |
Abstract |
This article presents ANCOR_Centre, a French coreference corpus available under a Creative Commons licence. With a size of around 500,000 words, the corpus is large enough to serve the needs of data-driven approaches in NLP and is one of the largest coreference resources currently available. The corpus focuses exclusively on spoken language and aims to represent a variety of spoken genres. ANCOR_Centre includes anaphora as well as coreference relations involving nominal and pronominal mentions. The paper describes in detail the annotation scheme and the reliability measures computed on the resource. |
Topics |
Anaphora, Coreference, Dialogue |
Full paper |
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures |
Bibtex |
@InProceedings{MUZERELLE14.150,
  author = {Judith Muzerelle and Anaïs Lefeuvre and Emmanuel Schang and Jean-Yves Antoine and Aurore Pelletier and Denis Maurel and Iris Eshkol and Jeanne Villaneau},
  title = {ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |