Summary of the paper

Title A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions
Authors Amalia Todiraşcu, Dan Tufiş, Ulrich Heid, Christopher Gledhill, Dan Ştefanescu, Marion Weller and François Rousselot
Abstract We present the main findings and preliminary results of an ongoing project aimed at developing a system for collocation extraction based on contextual morpho-syntactic properties. We explored two hybrid extraction methods: the first applies language-independent statistical techniques followed by linguistic filtering, while the second, available only for German, uses a set of lexico-syntactic patterns to extract collocation candidates. To define extraction and filtering patterns, we studied a specific collocation category, Verb-Noun constructions, using a model inspired by systemic functional grammar that proposes a three-level analysis based on lexical, functional and semantic criteria. From a tagged and lemmatized corpus, we identify contextual morpho-syntactic properties that help to filter the output of the statistical methods and to extract potentially interesting VN constructions (complex predicates vs. complex predicators). The extracted candidates are validated and classified manually.
Language Multiple languages
Topics MultiWord Expressions & Collocations, Statistical methods, Multilinguality
Full paper A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions
Slides -
Bibtex @InProceedings{TODIRACU08.500,
  author = {Amalia Todiraşcu and Dan Tufiş and Ulrich Heid and Christopher Gledhill and Dan Ştefanescu and Marion Weller and François Rousselot},
  title = {A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
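
The first extraction method described in the abstract (language-independent statistics followed by a linguistic filter) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the association measure, so pointwise mutual information is used here purely as a stand-in, and the toy corpus, the tag names (VERB, NOUN, DET), the window size and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Toy tagged, lemmatized corpus: each sentence is a list of (lemma, POS) pairs.
# Sentences and tag names are invented for illustration only.
CORPUS = [
    [("take", "VERB"), ("a", "DET"), ("decision", "NOUN")],
    [("take", "VERB"), ("the", "DET"), ("decision", "NOUN")],
    [("make", "VERB"), ("a", "DET"), ("decision", "NOUN")],
    [("take", "VERB"), ("a", "DET"), ("book", "NOUN")],
    [("read", "VERB"), ("the", "DET"), ("book", "NOUN")],
    [("read", "VERB"), ("a", "DET"), ("book", "NOUN")],
]


def cooccurrences(corpus, window=3):
    """Count ordered token pairs that co-occur within `window` following tokens."""
    pair_counts, unigram_counts = Counter(), Counter()
    for sentence in corpus:
        for i, left in enumerate(sentence):
            unigram_counts[left] += 1
            for right in sentence[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1
    return pair_counts, unigram_counts


def pmi_scores(pair_counts, unigram_counts):
    """Step 1 (statistical, language-independent): score each pair by PMI."""
    n_tokens = sum(unigram_counts.values())
    n_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        p_pair = count / n_pairs
        p_left = unigram_counts[left] / n_tokens
        p_right = unigram_counts[right] / n_tokens
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores


def verb_noun_filter(scores, min_score=0.0):
    """Step 2 (linguistic filter): keep VERB ... NOUN pairs above a threshold."""
    kept = [
        (left[0], right[0], score)
        for (left, right), score in scores.items()
        if left[1] == "VERB" and right[1] == "NOUN" and score >= min_score
    ]
    # Highest-scoring candidates first.
    return sorted(kept, key=lambda item: item[2], reverse=True)


candidates = verb_noun_filter(pmi_scores(*cooccurrences(CORPUS)))
for verb, noun, score in candidates:
    print(f"{verb} + {noun}: {score:.2f}")
```

The ranked list is only a set of candidates. As the abstract notes, the extracted Verb-Noun constructions are still validated and classified manually, and a realistic filter would rely on richer contextual morpho-syntactic properties than a bare part-of-speech check.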
