Title |
Named Entity Recognition in Questions: Towards a Golden Collection |
Authors |
Ana Cristina Mendes, Luísa Coheur and Paula Vaz Lobo |
Abstract |
Named Entity Recognition (NER) plays a relevant role in several Natural Language Processing tasks. Question Answering (QA) is one such task, since answers are frequently named entities that match the semantic category expected by a given question. In this context, named entity recognition is usually applied to free-text data. NER in natural language questions can also aid QA and, thus, should not be disregarded. Nevertheless, it has not yet received the attention it deserves. In this paper, we address the identification and classification of named entities in natural language questions. We hypothesize that NER results can benefit from the inclusion of previously labeled questions in the training corpus. We present a broad study addressing that hypothesis, focusing on the balance to be achieved between the amount of free text and the number of questions in order to build a suitable training corpus. This work also contributes a set of nearly 5,500 questions annotated with their named entities, freely available for research purposes. |
Topics |
Corpus (creation, annotation, etc.), Named Entity recognition, Question Answering |
Full paper |
Named Entity Recognition in Questions: Towards a Golden Collection |
Slides |
- |
Bibtex |
@InProceedings{MENDES10.97,
  author = {Ana Cristina Mendes and Luísa Coheur and Paula Vaz Lobo},
  title = {Named Entity Recognition in Questions: Towards a Golden Collection},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |