SUMMARY : Session P12-W
Title | Identifying Named Entities in Text Databases from the Natural History Domain |
---|---|
Authors | C. Sporleder, M. Erp, T. Porcelijn, A. Bosch, P. Arntzen |
Abstract | In this paper, we investigate whether it is possible to bootstrap a named entity tagger for textual databases by exploiting the database structure to automatically generate domain- and database-specific gazetteer lists. We compare three tagging strategies: (i) using the extracted gazetteers in a look-up tagger, (ii) using the gazetteers to automatically extract training data to train a database-specific tagger, and (iii) using a generic named entity tagger. Our results suggest that automatically built gazetteers in combination with a look-up tagger lead to relatively good performance and that generic taggers do not perform particularly well on this type of data. |
Keywords | Named-Entity Tagging, Text Databases, Machine Learning |
Full paper | Identifying Named Entities in Text Databases from the Natural History Domain |
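
Strategy (i) in the abstract, a look-up tagger driven by gazetteers derived from the database structure, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the column names, the entity labels, the sample record, and the greedy longest-match rule are all assumptions made for the example.

```python
from collections import defaultdict


def build_gazetteers(records, entity_columns):
    """Collect the values of selected columns into one gazetteer per entity type.

    `entity_columns` maps a column name to an entity label; both are
    hypothetical and would depend on the actual database schema.
    """
    gazetteers = defaultdict(set)
    for record in records:
        for column, entity_type in entity_columns.items():
            value = (record.get(column) or "").strip()
            if value:
                # Store each entry as a tuple of lower-cased tokens.
                gazetteers[entity_type].add(tuple(value.lower().split()))
    return gazetteers


def lookup_tag(tokens, gazetteers, max_len=5):
    """Tag tokens by greedy longest match against the gazetteers."""
    tags = ["O"] * len(tokens)
    lowered = [token.lower() for token in tokens]
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(lowered[i:i + n])
            for entity_type, entries in gazetteers.items():
                if candidate in entries:
                    tags[i:i + n] = [entity_type] * n
                    i += n
                    matched = True
                    break
            if matched:
                break
        if not matched:
            i += 1
    return tags


# Hypothetical record: structured columns plus a free-text field.
records = [{"genus": "Rana", "country": "Costa Rica", "remarks": "field notes"}]
gazetteers = build_gazetteers(records, {"genus": "GENUS", "country": "LOCATION"})

tokens = "Collected near San Jose , Costa Rica ; Rana sp .".split()
for token, tag in zip(tokens, lookup_tag(tokens, gazetteers)):
    print(token, tag)
```

Strategy (ii) could reuse the same look-up output as automatically labelled training data for a learned tagger; that step is not shown here.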