Summary of the paper

Title MWEs in Treebanks: From Survey to Guidelines
Authors Victoria Rosén, Koenraad De Smedt, Gyri Smørdal Losnegaard, Eduard Bejček, Agata Savary and Petya Osenova
Abstract By means of an online survey, we have investigated ways in which various types of multiword expressions are annotated in existing treebanks. The results indicate that there is considerable variation in treatments across treebanks and thereby also, to some extent, across languages and across theoretical frameworks. The comparison is focused on the annotation of light verb constructions and verbal idioms. The survey shows that the light verb constructions either get special annotations as such, or are treated as ordinary verbs, while VP idioms are handled through different strategies. Based on insights from our investigation, we propose some general guidelines for annotating multiword expressions in treebanks. The recommendations address the following application-based needs: distinguishing MWEs from similar but compositional constructions; searching distinct types of MWEs in treebanks; awareness of literal and nonliteral meanings; and normalization of the MWE representation. The cross-lingually and cross-theoretically focused survey is intended as an aid to accessing treebanks and an aid for further work on treebank annotation.
Topics Standards for LRs, Corpus (Creation, Annotation, etc.), MultiWord Expressions & Collocations
Full paper MWEs in Treebanks: From Survey to Guidelines
Bibtex @InProceedings{ROSN16.1131,
  author = {Victoria Rosén and Koenraad De Smedt and Gyri Smørdal Losnegaard and Eduard Bejček and Agata Savary and Petya Osenova},
  title = {MWEs in Treebanks: From Survey to Guidelines},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
 }
Powered by ELDA © 2016 ELDA/ELRA