Title |
Inter-sentential Relations in Information Extraction Corpora |
Authors |
Kumutha Swampillai and Mark Stevenson |
Abstract |
In natural language relationships between entities can asserted within a single sentence or over many sentences in a document. Many information extraction systems are constrained to extracting binary relations that are asserted within a single sentence (single-sentence relations) and this limits the proportion of relations they can extract since those expressed across multiple sentences (inter-sentential relations) are not considered. The analysis in this paper focuses on finding the distribution of inter-sentential and single-sentence relations in two corpora used for the evaluation of Information Extraction systems: the MUC6 corpus and the ACE corpus from 2003. In order to carry out this analysis we had to manually mark up all the management succession relations described in the MUC6 corpus. It was found that inter-sentential relations constitute 28.5% and 9.4% of the total number of relations in MUC6 and ACE03 respectively. This places upper bounds on the recall of information extraction systems that do not consider relations that are asserted across multiple sentences (71.5% and 90.6% respectively). |
Topics |
Corpus (creation, annotation, etc.), Information Extraction, Information Retrieval, Validation of LRs |
Full paper |
Inter-sentential Relations in Information Extraction Corpora |
Slides |
- |
Bibtex |
@InProceedings{SWAMPILLAI10.905,
author = {Kumutha Swampillai and Mark Stevenson}, title = {Inter-sentential Relations in Information Extraction Corpora}, booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)}, year = {2010}, month = {may}, date = {19-21}, address = {Valletta, Malta}, editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias}, publisher = {European Language Resources Association (ELRA)}, isbn = {2-9517408-6-7}, language = {english} } |