Title |
A lexicon for biology and bioinformatics: the BOOTStrep experience. |
Authors |
Valeria Quochi, Monica Monachini, Riccardo Del Gratta and Nicoletta Calzolari |
Abstract |
This paper describes the design, implementation and population of a lexical resource for biology and bioinformatics (the BioLexicon) developed within an ongoing European project. The aim of this project is text-based knowledge harvesting to support information extraction and text mining in the biomedical domain. The BioLexicon is a large-scale lexical-terminological resource that encodes different information types in a single integrated resource. In the design of the resource we follow the ISO/DIS 24613 Lexical Mark-up Framework standard, which ensures reusability of the encoded information and easy exchange of both data and architecture. The design also takes into account the needs of our text mining partners, who automatically extract syntactic and semantic information from texts and feed it to the lexicon. The present contribution first describes in detail the model of the BioLexicon along its three main layers: morphology, syntax and semantics. It then briefly describes the database implementation of the model and the population strategy followed within the project, together with an example. The BioLexicon database comes equipped with automatic uploading procedures based on a common XML exchange format, which guarantees that the lexicon can be properly populated with data coming from different sources. |
Language |
Single language |
Topics |
Lexicon, lexical database, Standards for LRs, Other |
Full paper |
A lexicon for biology and bioinformatics: the BOOTStrep experience. |
Slides |
- |
Bibtex |
@InProceedings{QUOCHI08.576,
author = {Valeria Quochi and Monica Monachini and Riccardo Del Gratta and Nicoletta Calzolari},
title = {A lexicon for biology and bioinformatics: the {BOOTStrep} experience.},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = {may},
date = {28-30},
address = {Marrakech, Morocco},
editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |