Title |
Duration Modeling for Turkish Text-to-Speech Synthesis System |
Author(s) |
Ö. Öztürk (1), Ö. Salor (2), T. Çiloğlu (2), M. Demirekler (2); (1) Dept. of Electrical and Electronics Eng., Dokuz Eylul Univ., Izmir, Turkey; (2) Dept. of Electrical and Electronics Eng., Middle East Tech. Univ., Ankara, Turkey |
Session |
P27-SE |
Abstract |
The naturalness of synthetic speech depends on appropriate modeling of its prosody. Three prosodic components are most commonly modeled: segmental duration, pitch contour, and intensity. In this study, we model segmental duration in Turkish using machine-learning algorithms. The models predict phone durations from attributes such as phone identity, neighboring phone identities, lexical stress, position of the syllable in the word, part-of-speech information, word length in syllables, and position of the word in the utterance. The resulting models predict segment durations more accurately than mean-duration approximations. |
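As an illustration of the comparison described in the abstract, the following minimal sketch contrasts a per-phone mean-duration baseline with a predictor that also conditions on context attributes. It is not the authors' method: the abstract does not name the learning algorithms used, so a simple context-conditioned mean with backoff stands in for a learned model. The sample values, the attribute subset (phone identity, lexical stress, syllable position), and all function names are assumptions made for this example, and the error is measured on the same toy samples used for fitting.

```python
from collections import defaultdict
from statistics import mean

# Toy, made-up samples (NOT data from the paper):
# (phone, lexically_stressed, syllable_position_in_word, duration_ms)
SAMPLES = [
    ("a", True, "final", 110.0),
    ("a", True, "final", 114.0),
    ("a", False, "initial", 70.0),
    ("a", False, "initial", 74.0),
    ("k", True, "final", 62.0),
    ("k", False, "initial", 50.0),
    ("k", False, "initial", 54.0),
]


def phone_key(sample):
    """Baseline key: phone identity only."""
    return sample[0]


def context_key(sample):
    """Context key: phone identity plus stress and syllable position."""
    return sample[:3]


def fit_means(samples, key):
    """Average the durations of all samples that share the same key."""
    groups = defaultdict(list)
    for sample in samples:
        groups[key(sample)].append(sample[-1])
    return {k: mean(v) for k, v in groups.items()}


def rmse(samples, predict):
    """Root-mean-square error of a predictor, in milliseconds."""
    squared = [(predict(s) - s[-1]) ** 2 for s in samples]
    return (sum(squared) / len(squared)) ** 0.5


phone_means = fit_means(SAMPLES, phone_key)
context_means = fit_means(SAMPLES, context_key)


def predict_baseline(sample):
    """Mean-duration approximation: one average duration per phone."""
    return phone_means[phone_key(sample)]


def predict_context(sample):
    """Context-conditioned mean, backing off to the per-phone mean
    when the exact context was not seen during fitting."""
    return context_means.get(context_key(sample), predict_baseline(sample))


# Errors are computed on the fitting samples themselves (illustration only).
baseline_rmse = rmse(SAMPLES, predict_baseline)
context_rmse = rmse(SAMPLES, predict_context)
print(f"baseline RMSE: {baseline_rmse:.2f} ms, context RMSE: {context_rmse:.2f} ms")
```

In a real evaluation the error would be computed on held-out utterances, and the remaining attributes listed in the abstract (neighboring phones, part of speech, word length, word position) would be added to the context key or passed to a proper learning algorithm.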
Keyword(s) |
TTS, duration modeling, machine learning, Turkish |
Language(s) |
Turkish |
Full Paper |