Title |
Spoken and Written Language Resources for Vietnamese |
Author(s) |
Viet-Bac Le (1), Do-Dat Tran (1, 2), Eric Castelli (2), Laurent Besacier (1), Jean-François Serignat (1); (1) CLIPS-IMAG Laboratory, UMR CNRS 5524, BP 53, 38041 Grenoble Cedex 9, FRANCE; (2) International Research Center MICA, 1 Dai Co Viet, Hanoi, VIETNAM |
Session |
P9-SE |
Abstract |
This paper presents an overview of our work on spoken and written language resources for Vietnamese, carried out at the CLIPS-IMAG Laboratory and the International Research Center MICA. We propose a new methodology for the fast acquisition of text corpora for minority languages and apply it to Vietnamese. We also present the first results of building a large Vietnamese speech database (VNSpeechCorpus) and a phonetic dictionary, which is used in the automatic alignment process. |
Keyword(s) |
Vietnamese language, Minority language, Speech corpus, Text corpus, Pronunciation dictionary |
Language(s) | Vietnamese |
Full Paper |