SUMMARY : Session O9-SE TTS & Units for TTS

 

Title Creation and analysis of a Polish speech database for use in unit selection synthesis
Authors D. Oliver, K. Szklanny
Abstract The main aim of this study is to describe the process of creating a speech database for use in corpus-based text-to-speech synthesis. To help achieve natural-sounding synthetic speech, the database was designed for rich phonetic and prosodic coverage, using variable-length units (phoneme, diphone, triphone) drawn from different phonetic and prosodic contexts. Following previous work on determining the optimal coverage (Szklanny and Oliver, 2005), text selection was based on an existing text corpus of parliamentary statements. Corpus balancing was followed by recording of the material. Automatic segmentation was then performed, followed by both automatic and manual checks of the data to identify speaker-specific phenomena and correct the labelling. Additionally, prosodic annotation, involving the assignment of intonation contours, was carried out in order to assess accent realisation and determine the prosodic coverage of the database. A prototype speech synthesiser was built to validate the above steps and test the resulting voice quality.
Keywords speech synthesis, database creation, human-computer communication, automatic segmentation, grapheme-to-phoneme conversion, unit selection
Full paper Creation and analysis of a Polish speech database for use in unit selection synthesis
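
The abstract mentions text selection and corpus balancing aimed at rich unit coverage, but this summary does not say which selection procedure the authors used. As an illustration only, the sketch below shows one common approach, greedy selection of sentences by the number of not-yet-covered diphones. The function names, the sentence identifiers and the toy phoneme strings are all hypothetical, and the input is assumed to be already converted from graphemes to phonemes.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phonemes: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phoneme pairs in one sentence."""
    return set(zip(phonemes, phonemes[1:]))


def greedy_select(corpus: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Pick sentence ids greedily by the number of new diphones they add.

    Assumption: `corpus` maps a sentence id to its phoneme sequence.
    This is a generic set-cover heuristic, not the method of the paper.
    """
    covered: Set[Diphone] = set()
    selected: List[str] = []
    remaining = {sid: diphones(ph) for sid, ph in corpus.items()}

    while remaining and len(selected) < max_sentences:
        # Sorting the ids makes tie-breaking deterministic.
        best_id = max(sorted(remaining),
                      key=lambda sid: len(remaining[sid] - covered))
        gain = remaining[best_id] - covered
        if not gain:
            break  # no remaining sentence adds a new diphone
        selected.append(best_id)
        covered |= gain
        del remaining[best_id]

    return selected


# Toy example with made-up phoneme sequences.
toy_corpus = {
    "s1": ["a", "b", "c"],
    "s2": ["a", "b"],
    "s3": ["c", "d", "e", "f"],
}
chosen = greedy_select(toy_corpus, max_sentences=10)
```

The same loop extends to triphones or to units tagged with prosodic context by changing what `diphones` returns; a real balancing step would typically also weight rare units and limit sentence length.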