Title

Acoustic Modeling and Training of a Bilingual ASR System when a Minority Language is Involved

Authors

Laura Docio-Fernandez (Departamento de Teoria de la Señal y Comunicaciones E.T.S.I. Telecomunicacion Campus Universitario de Vigo 36200 VIGO, SPAIN)

Carmen Garcia-Mateo (Departamento de Teoria de la Señal y Comunicaciones E.T.S.I. Telecomunicacion Campus Universitario de Vigo 36200 VIGO, SPAIN)

Session

SP2: Speech Varieties And Multilingual ASR

Abstract

This paper describes our work in developing a bilingual speech recognition system using two SpeechDat databases. The bilingual aspect of this work is of particular importance in the Galician region of Spain, where Galician and Spanish coexist and Galician is a minority language. Based on a global Spanish-Galician phoneme set, we built a bilingual speech recognition system that can handle both languages. The recognizer uses context-dependent acoustic models based on continuous-density hidden Markov models. The system was evaluated on an isolated-word, large-vocabulary task. The tests show that the Spanish system performs better than the Galician system because it is better trained. The bilingual system achieves performance equivalent to that of the language-specific systems.

Keywords

Minority languages, Multilingual ASR systems

Full Paper

16.pdf