Summary of the paper

Title Developments of “Lëtzebuergesch” Resources for Automatic Speech Processing and Linguistic Studies
Authors Martine Adda-Decker, Thomas Pellegrini, Eric Bilinski and Gilles Adda
Abstract In the present contribution we start with an overview of the linguistic situation of Luxembourg. We then describe specificities of spoken and written Lëtzebuergesch with respect to automatic speech processing. Multilingual code-switching and code-mixing, poor writing standardization as compared to languages such as English or French, and a large diversity of spoken varieties, together with limited written production in Lëtzebuergesch, pose many interesting challenges to automatic speech processing, both for speech technologies and for linguistic studies. Multilingual filtering has been investigated to sort out Luxembourgish from German and French. Word list coverage and language model perplexity results, using sibling resources collected from the Web, are presented. A phonemic inventory has been adopted for pronunciation dictionary development, a grapheme-phoneme tool has been developed, and pronunciation research issues related to the multilingual context are highlighted. The results achieved in resource development make it possible to envision the realisation of an ASR system.
Language Multiple languages
Topics Endangered languages, Corpus (creation, annotation, etc.), Speech recognition and understanding
Full paper Developments of “Lëtzebuergesch” Resources for Automatic Speech Processing and Linguistic Studies
Slides -
Bibtex @InProceedings{ADDADECKER08.855,
  author = {Martine Adda-Decker and Thomas Pellegrini and Eric Bilinski and Gilles Adda},
  title = {Developments of “Lëtzebuergesch” Resources for Automatic Speech Processing and Linguistic Studies},
  booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
  year = {2008},
  month = {may},
  date = {28-30},
  address = {Marrakech, Morocco},
  editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-4-0},
  note = {http://www.lrec-conf.org/proceedings/lrec2008/},
  language = {english}
  }
