LREC 2006 - Proceedings sorted by papers

Title	Creation and analysis of a Polish speech database for use in unit selection synthesis
Authors	D. Oliver, K. Szklanny
Abstract	The main aim of this study is to describe the process of creating a speech database for use in corpus-based text-to-speech synthesis. To help achieve natural-sounding speech synthesis, the database construction aimed at rich phonetic and prosodic coverage based on variable-length units (phoneme, diphone, triphone) from different phonetic and prosodic contexts. Following previous work on determining the optimal coverage (Szklanny and Oliver, 2005), text selection was based on an existing text corpus containing parliamentary statements. Corpus balancing was followed by recording of the material. Automatic segmentation was then performed, followed by both automatic and manual checks of the data to identify speaker-specific phenomena and correct the labelling. Additionally, prosodic annotation involving assignment of intonation contours was performed in order to assess the accent realisation and determine the prosodic coverage of the database. A prototype speech synthesiser was built to determine the validity of the above steps and test the resulting voice quality.
Keywords	speech synthesis, database creation, human-computer communication, automatic segmentation, grapheme to phoneme conversion, unit selection
Full paper	Creation and analysis of a Polish speech database for use in unit selection synthesis
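
Illustrative note: the abstract mentions corpus balancing for coverage of variable-length units but does not say which selection algorithm was used. The following is a minimal sketch of one common approach, greedy selection for diphone coverage, and is not the authors' method. The function names (`diphones`, `greedy_select`), the sentence identifiers, and the placeholder phoneme symbols are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phonemes: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phoneme pairs in one transcribed sentence."""
    return set(zip(phonemes, phonemes[1:]))


def greedy_select(corpus: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Greedily pick sentences that add the most not-yet-covered diphones.

    `corpus` maps a hypothetical sentence id to its phoneme sequence.
    Selection stops when the budget is used up or no remaining
    sentence contributes a new diphone.
    """
    units = {sid: diphones(ph) for sid, ph in corpus.items()}
    covered: Set[Diphone] = set()
    chosen: List[str] = []
    while len(chosen) < max_sentences:
        candidates = [sid for sid in units if sid not in chosen]
        if not candidates:
            break
        # Pick the sentence with the largest gain in coverage.
        best = max(candidates, key=lambda sid: len(units[sid] - covered))
        if not units[best] - covered:
            break  # nothing new left to cover
        chosen.append(best)
        covered |= units[best]
    return chosen


# Toy corpus with placeholder symbols (not real Polish transcriptions).
toy_corpus = {
    "s1": ["a", "b", "c"],
    "s2": ["a", "b"],
    "s3": ["c", "d", "a", "b"],
}
selected = greedy_select(toy_corpus, max_sentences=3)
```

In this toy run, `s2` is never chosen because its only diphone is already covered by the other sentences. A real balancing step would also need to weight triphones and prosodic contexts, which this sketch leaves out.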
