Title |
Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu |
Authors |
Geneviève Caelen-Haumont and Sethserey Sam |
Abstract |
This paper presents an expert phonetician's assessment of the automatic labeling of Mo Piu, an undocumented, unknown, unwritten and under-resourced language of northern Vietnam. In the previous stage of the work, seven sets of languages, drawn from Mandarin, Vietnamese, Khmer, English and French, were compared in order to select the best models of language for the phonetic labeling of Mo Piu isolated words. The two sets that obtained the best scores, (1) Mandarin + French and (2) Vietnamese + French, showed an additional distribution of their results. Our aim is now to study this distribution more precisely and more extensively, in order to select statistically the best models of language and, among them, the best sets of phonetic units, namely those that minimize automatic phonetic labeling errors. |
Topics |
Phonetic Databases, Phonology, Discourse annotation, representation and processing, Statistical and machine learning methods |
Bibtex |
@InProceedings{CAELENHAUMONT12.208,
  author = {Geneviève Caelen-Haumont and Sethserey Sam},
  title = {Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
} |