Summary of the paper

Title Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus
Authors Abbas Ghaddar and Phillippe Langlais
Abstract This paper presents WiFiNE, an English corpus annotated with fine-grained entity types. We propose simple but effective heuristics we applied to English Wikipedia to build a large, high quality, annotated corpus. We evaluate the impact of our corpus on the fine-grained entity typing system of Shimaoka et al. (2017), with 2 manually annotated benchmarks, FIGER (GOLD) and ONTONOTES . We report state-of-the-art performances, with a gain of 0.8 micro F1 score on the former dataset and a gain of 2.7 macro F1 score on the latter one, despite the fact that we employ the same quantity of training data used in previous works. We make our corpus available as a resource for future works.
Topics Named Entity Recognition, Corpus (Creation, Annotation, Etc.)
Full paper Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus
Bibtex @InProceedings{GHADDAR18.11,
  author = {Abbas Ghaddar and Phillippe Langlais},
  title = "{Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
  }
Powered by ELDA © 2018 ELDA/ELRA