Summary of the paper

Title A Swedish Scientific Medical Corpus for Terminology Management and Linguistic Exploration
Authors Dimitrios Kokkinakis and Ulla Gerdin
Abstract This paper describes the development of a new Swedish scientific medical corpus. We provide a detailed description of the characteristics of this new collection as well as results of applying the corpus to term management tasks, including terminology validation and terminology extraction. Although the corpus is representative of the scientific medical domain, it still covers in detail many specialised sub-disciplines, such as diabetes and osteoporosis, which makes it suitable for facilitating the production of smaller but more focused sub-corpora. We address this issue by making explicit some features of the corpus in order to demonstrate its usability, particularly for the quality assessment of subsets of official terminologies such as the Systematized NOmenclature of MEDicine - Clinical Terms (SNOMED CT). Domain-dependent language resources, labelled or not, are crucial components for progressing R&D in the human language technology field. Such resources are an indispensable, integrated part of terminology management, evaluation, software prototyping and design validation, and a prerequisite for the development and evaluation of a number of sublanguage-dependent applications, including information extraction, text mining and information retrieval.
Topics Corpus (creation, annotation, etc.), Evaluation methodologies
Full paper A Swedish Scientific Medical Corpus for Terminology Management and Linguistic Exploration
Slides -
Bibtex @InProceedings{KOKKINAKIS10.60,
  author = {Dimitrios Kokkinakis and Ulla Gerdin},
  title = {A Swedish Scientific Medical Corpus for Terminology Management and Linguistic Exploration},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }