Title |
Towards a Motivated Annotation Schema of Collocation Errors in Learner Corpora |
Authors |
Margarita Alonso Ramos, Leo Wanner, Orsolya Vincze, Gerard Casamayor del Bosque, Nancy Vázquez Veiga, Estela Mosqueira Suárez and Sabela Prieto González |
Abstract |
Collocations play a significant role in second language acquisition. In order to be able to offer efficient support to learners, an NLP-based CALL environment for learning collocations should be based on a representative collocation-error-annotated learner corpus. However, so far, no theoretically motivated collocation error tag set is available. Existing learner corpora tag collocation errors simply as lexical errors, which is clearly insufficient given the wide range of different collocation errors that learners make. In this paper, we present a fine-grained three-dimensional typology of collocation errors that has been derived in an empirical study from the learner corpus CEDEL2 compiled by a team at the Autonomous University of Madrid. The first dimension captures whether the error concerns the collocation as a whole or one of its elements; the second dimension captures the language-oriented error analysis, while the third exemplifies the interpretative error analysis. To facilitate a smooth annotation along this typology, we adapted Knowtator, a flexible off-the-shelf annotation tool implemented as a Protégé plugin. |
Topics |
Corpus (creation, annotation, etc.), MultiWord Expressions & Collocations, Other |
Bibtex |
@InProceedings{ALONSORAMOS10.751,
  author = {Margarita Alonso Ramos and Leo Wanner and Orsolya Vincze and Gerard Casamayor del Bosque and Nancy Vázquez Veiga and Estela Mosqueira Suárez and Sabela Prieto González},
  title = {Towards a Motivated Annotation Schema of Collocation Errors in Learner Corpora},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |