LREC 2006 - Proceedings sorted by papers

Title	FonDat1: A Speech Synthesis Corpus for Norwegian
Authors	I. Amdal, T. Svendsen
Abstract	This paper describes the Norwegian speech database FonDat1, designed for development and assessment of Norwegian unit selection speech synthesis. The quality of unit selection speech synthesis systems depends heavily on the database used. The database should contain sufficient phonemic and prosodic coverage. High-quality unit selection synthesis also requires that the database is annotated with accurate information about the identity and position of the units. Traditionally this involves much manual work, either by hand labeling the entire database or by correcting automatic annotations. We are working on methods for a complete automation of the annotation process. To validate these methods, a realistic unit selection synthesis database is needed. In addition to serving as a testbed for annotation tools and synthesis experiments, the process of producing the database using automatic methods is in itself an important result. FonDat1 contains studio recordings of approximately 2000 sentences read by two professional speakers, one male and one female. 10% of the database is manually annotated.
Keywords	Norwegian, speech corpus, speech synthesis
Full paper	FonDat1: A Speech Synthesis Corpus for Norwegian
