Title |
Towards a Learning Approach for Abbreviation Detection and Resolution. |
Authors |
Klaar Vanopstal, Bart Desmet and Véronique Hoste |
Abstract |
The explosion of biomedical literature, and with it the uncontrolled creation of abbreviations, presents special challenges for both human readers and computer applications. We developed an annotated corpus of Dutch medical text and experimented with two approaches to abbreviation detection and resolution. Our corpus is composed of abstracts from two medical journals from the Low Countries, in which approximately 65 percent (NTvG) and 48 percent (TvG) of the abbreviations have a corresponding full form in the abstract. Our first approach, a pattern-based system, consists of two steps: abbreviation detection and definition matching. This system obtains an average F-score of 0.82 for the detection of both defined and undefined abbreviations, and an average F-score of 0.77 for the definitions. For our second approach, an SVM-based classifier was applied to the preprocessed data sets, leading to an average F-score of 0.93 for the abbreviations and an average F-score of 0.82 for the definitions. |
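The two steps of a pattern-based approach (abbreviation detection, then definition matching) can be illustrated with a minimal sketch. This is not the authors' system: the candidate regex, the initial-letter alignment rule, the function names `detect_abbreviations` and `match_definition`, and the English sample sentence are all assumptions chosen for illustration, whereas the paper works on Dutch medical abstracts.

```python
import re

# Hypothetical heuristic (an assumption, not the paper's rule): a token of
# 2-10 characters that starts with a letter and contains at least two
# uppercase letters is treated as an abbreviation candidate.
CANDIDATE = re.compile(r"\b(?=\w*[A-Z]\w*[A-Z])[A-Za-z][A-Za-z0-9]{1,9}\b")


def detect_abbreviations(text):
    """Step 1: return candidate abbreviations in order of first appearance."""
    seen = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0)
        if token not in seen:
            seen.append(token)
    return seen


def match_definition(text, abbreviation):
    """Step 2: find 'full form (ABBR)' and align word initials with ABBR.

    Returns the full form, or None when no aligned definition is found
    (an "undefined" abbreviation in the abstract's terminology).
    """
    pattern = re.compile(
        r"((?:[\w-]+\s+){1,10})\(" + re.escape(abbreviation) + r"\)"
    )
    for match in pattern.finditer(text):
        words = match.group(1).split()
        n = len(abbreviation)
        if len(words) < n:
            continue
        # Assumed alignment rule: one preceding word per abbreviation letter.
        candidate = words[-n:]
        initials = "".join(word[0] for word in candidate)
        if initials.lower() == abbreviation.lower():
            return " ".join(candidate)
    return None


sample = (
    "Patients with chronic obstructive pulmonary disease (COPD) underwent "
    "magnetic resonance imaging (MRI). The ECG was normal."
)
for abbreviation in detect_abbreviations(sample):
    print(abbreviation, "->", match_definition(sample, abbreviation))
```

In this sample, two candidates are followed by a parenthesised definition and one is not, mirroring the distinction between defined and undefined abbreviations. A one-word-per-letter rule is deliberately simplistic: it misses definitions containing function words or compounds, which is one reason a learning-based classifier may do better.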
Topics |
Corpus (creation, annotation, etc.), Text mining, Statistical and machine learning methods |
Full paper |
Towards a Learning Approach for Abbreviation Detection and Resolution. |
Slides |
Towards a Learning Approach for Abbreviation Detection and Resolution. |
Bibtex |
@InProceedings{VANOPSTAL10.737,
  author = {Klaar Vanopstal and Bart Desmet and Véronique Hoste},
  title = {Towards a Learning Approach for Abbreviation Detection and Resolution.},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |