Summary of the paper

Title Tools for Collocation Extraction: Preferences for Active vs. Passive
Authors Ulrich Heid and Marion Weller
Abstract We present and partially evaluate procedures for the extraction of noun+verb collocation candidates from German text corpora, along with their morphosyntactic preferences, especially for the active vs. passive voice. We start from tokenized, tagged, lemmatized and chunked text, and we use extraction patterns formulated in the CQP corpus query language. We discuss the results of a precision evaluation on administrative texts from the European Union: we find a considerable number of specialized collocations, as well as general ones and complex predicates; overall, the precision is considerably higher than that of a statistical extractor used as a baseline.
Topics MultiWord Expressions & Collocations, Lexicon, lexical database, Acquisition, Machine Learning
Bibtex @InProceedings{HEID08.323,
  author = {Ulrich Heid and Marion Weller},
  title = {Tools for Collocation Extraction: Preferences for Active vs. Passive},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
