Summary of the paper

Title Semantically Annotated Snapshot of the English Wikipedia
Authors Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaramita and Giuseppe Attardi
Abstract This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikipedia. In recent years Wikipedia has become a valuable resource for both the Natural Language Processing (NLP) community and the Information Retrieval (IR) community. Although NLP technology for processing Wikipedia already exists, not all researchers and developers have the computational resources to process such a volume of information. Moreover, the use of different versions of Wikipedia processed differently might make it difficult to compare results. The aim of this work is to provide easy access to syntactic and semantic annotations for researchers of both NLP and IR communities by building a reference corpus to homogenize experiments and make results comparable. These resources, a semantically annotated corpus and an “entity containment” derived graph, are licensed under the GNU Free Documentation License and available from http://www.yr-bcn.es/semanticWikipedia
Language Single language
Topics Corpus (creation, annotation, etc.), Information Extraction, Information Retrieval, Acquisition, Machine Learning
Full paper Semantically Annotated Snapshot of the English Wikipedia
Slides Semantically Annotated Snapshot of the English Wikipedia
Bibtex @InProceedings{ATSERIAS08.581,
  author = {Jordi Atserias and Hugo Zaragoza and Massimiliano Ciaramita and Giuseppe Attardi},
  title = {Semantically Annotated Snapshot of the English Wikipedia},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
