Summary of the paper

Title The NewSoMe Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
Authors Roser Saurí, Judith Domingo and Toni Badia
Abstract We present the NewSoMe (News and Social Media) Corpus, a set of subcorpora with annotations on opinion expressions across genres (news reports, blogs, product reviews and tweets) and covering multiple languages (English, Spanish, Catalan and Portuguese). NewSoMe is the result of an effort to increase the opinion corpus resources available in languages other than English, and to build a unifying annotation framework for analyzing opinion in different genres, including controlled text, such as news reports, as well as different types of user-generated content (UGC). Given the broad design of the resource, most of the annotation effort was carried out using crowdsourcing platforms: Amazon Mechanical Turk and CrowdFlower. This created an excellent opportunity to investigate the feasibility of crowdsourcing methods for annotating large amounts of text in different languages.
Topics Opinion Mining / Sentiment Analysis, Crowdsourcing
Full paper The NewSoMe Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
Bibtex @InProceedings{SAUR14.350,
  author = {Roser Saurí and Judith Domingo and Toni Badia},
  title = {The NewSoMe Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
 }