Summary of the paper

Title Towards a Motivated Annotation Schema of Collocation Errors in Learner Corpora
Authors Margarita Alonso Ramos, Leo Wanner, Orsolya Vincze, Gerard Casamayor del Bosque, Nancy Vázquez Veiga, Estela Mosqueira Suárez and Sabela Prieto González
Abstract Collocations play a significant role in second language acquisition. In order to offer efficient support to learners, an NLP-based CALL environment for learning collocations should be based on a representative collocation-error-annotated learner corpus. However, so far, no theoretically motivated collocation error tag set is available. Existing learner corpora tag collocation errors simply as “lexical errors”, which is clearly insufficient given the wide range of different collocation errors that learners make. In this paper, we present a fine-grained three-dimensional typology of collocation errors that has been derived in an empirical study from the learner corpus CEDEL2, compiled by a team at the Autonomous University of Madrid. The first dimension captures whether the error concerns the collocation as a whole or one of its elements; the second dimension captures the language-oriented error analysis, while the third exemplifies the interpretative error analysis. To facilitate smooth annotation along this typology, we adapted Knowtator, a flexible off-the-shelf annotation tool implemented as a Protégé plugin.
Topics Corpus (creation, annotation, etc.), MultiWord Expressions & Collocations, Other
Bibtex @InProceedings{ALONSORAMOS10.751,
  author = {Margarita Alonso Ramos and Leo Wanner and Orsolya Vincze and Gerard Casamayor del Bosque and Nancy Vázquez Veiga and Estela Mosqueira Suárez and Sabela Prieto González},
  title = {Towards a Motivated Annotation Schema of Collocation Errors in Learner Corpora},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
 }