Summary of the paper

Title The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes
Authors Phil Bartie, William Mackaness, Dimitra Gkatzia and Verena Rieser
Abstract Our interest is in people’s capacity to efficiently and effectively describe geographic objects in urban scenes. The broader ambition is to develop spatial models capable of equivalent functionality able to construct such referring expressions. To that end we present a newly crowd-sourced data set of natural language references to objects anchored in complex urban scenes (In short: The REAL Corpus ― Referring Expressions Anchored Language). The REAL corpus contains a collection of images of real-world urban scenes together with verbal descriptions of target objects generated by humans, paired with data on how successful other people were able to identify the same object based on these descriptions. In total, the corpus contains 32 images with on average 27 descriptions per image and 3 verifications for each description. In addition, the corpus is annotated with a variety of linguistically motivated features. The paper highlights issues posed by collecting data using crowd-sourcing with an unrestricted input format, as well as using real-world urban scenes.
Topics Corpus (Creation, Annotation, etc.), Crowdsourcing, Natural Language Generation
Full paper The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes
Bibtex @InProceedings{BARTIE16.1035,
  author = {Phil Bartie and William Mackaness and Dimitra Gkatzia and Verena Rieser},
  title = {The REAL Corpus: A Crowd-Sourced Corpus of Human Generated and Evaluated Spatial References to Real-World Urban Scenes},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }
Powered by ELDA © 2016 ELDA/ELRA