Summary of the paper

Title Toward Active Learning in Data Selection: Automatic Discovery of Language Features During Elicitation
Authors Jonathan Clark, Robert Frederking and Lori Levin
Abstract Data Selection has emerged as a common issue in language technologies. We define Data Selection as the choosing of a subset of training data that is most effective for a given task. This paper describes deductive feature detection, one component of a data selection system for machine translation. Feature detection determines whether features such as tense, number, and person are expressed in a language. The database of The World Atlas of Language Structures provides a gold standard against which to evaluate feature detection. The discovered features can be used as input to a Navigator, which uses active learning to determine which piece of language data is the most important to acquire next.
Language Multiple languages
Topics Corpus (creation, annotation, etc.), Machine Translation, Speech-to-Speech Translation, Typological databases
Bibtex @InProceedings{CLARK08.308,
  author = {Jonathan Clark and Robert Frederking and Lori Levin},
  title = {Toward Active Learning in Data Selection: Automatic Discovery of Language Features During Elicitation},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
