LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | Semantic Tagging for the Penn Treebank |
Authors | Palmer Martha (Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, USA, mpalmer@linc.cis.upenn.edu); Trang Dang Hoa (University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA, USA, htd@linc.cis.upenn.edu); Rosenzweig Joseph (University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA, USA, josephr@linc.cis.upenn.edu) |
Keywords | Inter-Annotator Agreement, Predicate-Argument Structure, Sense Distinctions, Training Data |
Session | Session WO10 - Semantic Annotation of Corpora |
Full Paper | 197.ps, 197.pdf |
Abstract | This paper describes the methodology being used to augment the Penn Treebank annotation with sense tags and other types of semantic information. Inspired by the results of SENSEVAL and the high inter-annotator agreement achieved there, we applied similar methods in a pilot study of 5000 words of running text from the Penn Treebank. Using the same techniques of allowing the annotators to discuss difficult tagging cases and to revise WordNet entries where necessary, we achieved comparable inter-annotator agreement rates. We describe the criteria for determining appropriate revisions and ensuring clear sense distinctions. We are also using hand correction of automatic predicate-argument structure information to provide additional thematic role labeling. |