Title |
Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System |
Authors |
Rojc Matej (Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000 Maribor, matej.rojc@uni-mb.si) Kačič Zdravko (Faculty of Electrical Engineering and Computer Science, University of Maribor, Smetanova 17, 2000 Maribor, kacic@uni-mb.si) |
Keywords |
Grapheme-to-Phoneme Conversion, Non-Uniform Units, Text Processing |
Session |
Session SP1 - Phonetic Issues and Speech Synthesis |
Full Paper |
177.ps, 177.pdf |
Abstract |
This paper presents the development of a Slovenian speech corpus for use in the concatenative speech synthesis system being developed at the University of Maribor, Slovenia. The emphasis is on maximising the usefulness of the defined speech corpus for concatenation purposes. The usefulness of a speech corpus depends strongly on the corresponding text and can be increased if appropriate text is chosen. In the approach we used, detailed statistics of the text corpora were computed in order to define sentences rich in non-uniform units such as monophones, diphones and triphones. |
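
The abstract does not spell out how sentences rich in non-uniform units are chosen, so the following is only an illustrative sketch of one common way to approach it, not the authors' procedure. It assumes each sentence has already been converted to a list of phoneme symbols, counts monophones, diphones and triphones over the text corpus, and then greedily picks the sentences that add the most units not yet covered. The function names (`units`, `unit_statistics`, `greedy_select`) and the toy phoneme data are hypothetical.

```python
from collections import Counter


def units(phones, n):
    """Return all n-phone units of one sentence (n=1 monophones, 2 diphones, 3 triphones)."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def sentence_units(phones):
    """Set of distinct monophones, diphones and triphones found in one sentence."""
    return set(units(phones, 1)) | set(units(phones, 2)) | set(units(phones, 3))


def unit_statistics(corpus):
    """Count how often each unit occurs across the whole text corpus."""
    counts = Counter()
    for phones in corpus:
        for n in (1, 2, 3):
            counts.update(units(phones, n))
    return counts


def greedy_select(corpus, max_sentences):
    """Greedy coverage (an assumed strategy): repeatedly take the sentence
    that contributes the largest number of units not covered so far."""
    covered = set()
    chosen = []
    remaining = list(range(len(corpus)))
    while remaining and len(chosen) < max_sentences:
        best = max(remaining, key=lambda i: len(sentence_units(corpus[i]) - covered))
        gain = sentence_units(corpus[best]) - covered
        if not gain:
            break  # no remaining sentence adds a new unit
        covered |= gain
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered


# Toy corpus: each sentence is a list of (made-up) phoneme symbols.
toy_corpus = [["a", "b", "c"], ["a", "b"], ["d", "e"]]
stats = unit_statistics(toy_corpus)
selected, covered_units = greedy_select(toy_corpus, max_sentences=2)
```

In this sketch every distinct unit counts equally. A frequency-weighted score based on `unit_statistics` would be a natural variation when rare units should be prioritised or ignored, but that choice is likewise an assumption rather than something stated in the abstract.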