Title

Evaluation Corpora for Sense Disambiguation in the Medical Domain

Authors

Diana Raileanu (DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany)

Paul Buitelaar (DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany)

Spela Vintar (DFKI GmbH, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany)

Jörg Bay (Zinfo, University of Frankfurt, 60590 Frankfurt am Main, Germany)

Session

EP1: Evaluation

Abstract

An important aspect of word sense disambiguation is the evaluation of different methods and parameters. Unfortunately, test sets for evaluation are scarce, particularly for languages other than English and even more so for specific domains such as medicine. Because our work focuses on both English and German text in the medical domain, we had to develop our own evaluation corpora to test our disambiguation methods. In this paper we describe the development of these corpora, which use GermaNet and UMLS as (lexical) semantic resources, and we describe KiC, the annotation tool we developed to support the annotation task.

Keywords

Evaluation corpora

Full Paper

166.pdf