Title |
Using Transfer Learning to Assist Exploratory Corpus Annotation |
Authors |
Paul Felt, Eric Ringger, Kevin Seppi and Kristian Heal |
Abstract |
We describe an under-studied problem in language resource management: that of providing automatic assistance to annotators working in exploratory settings. When no satisfactory tagset already exists, such as in under-resourced or undocumented languages, it must be developed iteratively while annotating data. This process naturally gives rise to a sequence of datasets, each annotated differently. We argue that this problem is best regarded as a transfer learning problem with multiple source tasks. Using part-of-speech tagging data with simulated exploratory tagsets, we demonstrate that even simple transfer learning techniques can significantly improve the quality of pre-annotations in an exploratory annotation. |
Topics |
Endangered Languages, Part-of-Speech Tagging |
Full paper |
Using Transfer Learning to Assist Exploratory Corpus Annotation |
Bibtex |
@InProceedings{FELT14.147,
  author = {Paul Felt and Eric Ringger and Kevin Seppi and Kristian Heal},
  title = {Using Transfer Learning to Assist Exploratory Corpus Annotation},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |