Title | NameNet: A Self-Improving Resource for Name Classification |
Author(s) |
Paul Morarescu, Sanda Harabagiu
Human Language Technology Research Institute, Department of Computer Science, University of Texas at Dallas |
Session | O15-W |
Abstract | This paper presents a semantically structured resource of more than 1,600 Name Classes. This structure is based on the noun hypernymy hierarchies in WordNet, expanded and validated by corpus evidence collected from the World Wide Web. The set of seed examples provided by WordNet is bootstrapped and then used to automatically construct an annotated training corpus for each Name Class. The resulting Named Entity resource enables a supervised Named Entity Recognizer to identify all the encoded Name Classes with high accuracy and without any human intervention. |
Keyword(s) | Named Entity Recognition, Information Extraction |
Language(s) | English |
Full Paper | 693.pdf |