Title

Spoken and Written Language Resources for Vietnamese

Author(s)

Viet-Bac Le (1), Do-Dat Tran (1, 2), Eric Castelli (2), Laurent Besacier (1), Jean-François Serignat (1)

(1) CLIPS-IMAG Laboratory, UMR CNRS 5524, BP 53, 38041 Grenoble Cedex 9, FRANCE; (2) International Research Center MICA, 1 Dai Co Viet, Hanoi, VIETNAM

Session

P9-SE

Abstract

This paper presents an overview of our work on spoken and written language resources for Vietnamese, carried out at the CLIPS-IMAG Laboratory and the International Research Center MICA. We propose a new methodology for fast text corpus acquisition for minority languages and apply it to Vietnamese. We also present the first results of building a large Vietnamese speech database (VNSpeechCorpus) and a phonetic dictionary, which is used in the automatic alignment process.

Keyword(s)

Vietnamese language, Minority language, Speech corpus, Text corpus, Pronunciation dictionary

Language(s)

Vietnamese

Full Paper

586.pdf