LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | Annotating Resources for Information Extraction |
Authors | Boisen Sean (BBN Technologies, 87 Fawcett Street, Cambridge MA 02138, email: Sean.Boisen@bbn.com); Crystal Michael R. (BBN Technologies, 87 Fawcett Street, Cambridge MA 02138); Schwartz Richard (BBN Technologies, 87 Fawcett Street, Cambridge MA 02138); Stone Rebecca (BBN Technologies, 87 Fawcett Street, Cambridge MA 02138); Weischedel Ralph (BBN Technologies, 87 Fawcett Street, Cambridge MA 02138) |
Keywords | Annotation, Information Extraction, Named Entity Extraction, Trained Systems |
Session | Session WO14 - Named Entity Recognition |
Full Paper | 263.ps, 263.pdf |
Abstract | Trained systems for Named Entity (NE) extraction have shown significant promise because of their robustness to errorful input and their rapid adaptability. However, these learning algorithms have shifted the cost of development from skilled computational-linguistic expertise to data annotation, putting a new premium on effective ways to produce high-quality annotated resources at minimal cost. The paper reflects on BBN’s four years of experience in annotating training data for NE extraction systems and discusses useful techniques for maximizing data quality and quantity. |