LREC 2016 Proceedings

Summary of the paper

Title	Crowdsourced Corpus with Entity Salience Annotations
Authors	Milan Dojchinovski, Dinesh Reddy, Tomáš Kliegr, Tomas Vitvar and Harald Sack
Abstract	In this paper, we present a crowdsourced dataset that adds entity salience (importance) annotations to the Reuters-128 dataset, which is a subset of Reuters-21578. The dataset is distributed under a free license and published in the NLP Interchange Format, which fosters interoperability and re-use. We show the potential of the dataset on the task of learning an entity salience classifier and report on the results from several experiments.
Topics	Crowdsourcing, Corpus (Creation, Annotation, etc.), Named Entity Recognition
Full paper	Crowdsourced Corpus with Entity Salience Annotations
Bibtex	@InProceedings{DOJCHINOVSKI16.499, author = {Milan Dojchinovski and Dinesh Reddy and Tomáš Kliegr and Tomas Vitvar and Harald Sack}, title = {Crowdsourced Corpus with Entity Salience Annotations}, booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)}, year = {2016}, month = {may}, date = {23-28}, location = {Portorož, Slovenia}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis}, publisher = {European Language Resources Association (ELRA)}, address = {Paris, France}, isbn = {978-2-9517408-9-1}, language = {english} }