Title

Development and Evaluation of a Korean Treebank and its Application to NLP 

Authors

Chung-hye Han (Simon Fraser University and University of Pennsylvania)

Na-Rae Han (Simon Fraser University and University of Pennsylvania)

Eon-Suk Ko (Simon Fraser University and University of Pennsylvania)

Martha Palmer (Simon Fraser University and University of Pennsylvania)

Session

WP4: Corpus Annotation

Abstract

This paper discusses issues in building a 54-thousand-word Korean Treebank using phrase structure annotation, along with the development of annotation guidelines based on the morpho-syntactic phenomena represented in the corpus. The various methods employed for quality control are presented, together with an evaluation of the quality of the Treebank and some of the NLP applications under development that use it.

Keywords

Treebank, Annotated corpus, Korean, POS tagging, Syntactic bracketing, Morphological tagger, Parsing

Full Paper

151.pdf