Title |
Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual |
Authors |
Xuansong Li, Stephanie Strassel, Heng Ji, Kira Griffitt and Joe Ellis |
Abstract |
To advance information extraction and question answering technologies toward a more realistic path, the U.S. NIST (National Institute of Standards and Technology) initiated the KBP (Knowledge Base Population) task as one of the TAC (Text Analysis Conference) evaluation tracks. It aims to encourage research in automatic information extraction of named entities from unstructured texts with the ultimate goal of integrating such information into a structured Knowledge Base. The KBP track consists of two types of evaluation: Named Entity Linking (NEL) and Slot Filling. This paper describes the linguistic resource creation efforts at the Linguistic Data Consortium (LDC) in support of Named Entity Linking evaluation of KBP, focusing on annotation methodologies, process, and features of corpora from 2009 to 2011, with a highlighted analysis of the cross-lingual NEL data. Progressing from monolingual to cross-lingual Entity Linking technologies, the 2011 cross-lingual NEL evaluation targeted multilingual capabilities. Annotation accuracy is presented in comparison with system performance, with promising results from cross-lingual entity linking systems. |
Topics |
Corpus (creation, annotation, etc.), Information Extraction, Information Retrieval, Multilinguality |
Bibtex |
@InProceedings{LI12.278,
  author = {Xuansong Li and Stephanie Strassel and Heng Ji and Kira Griffitt and Joe Ellis},
  title = {Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
} |