SUMMARY: Session O4-S Speech Corpora and Dialogue
Title | Corpus description of the ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News |
---|---|
Authors | S. Galliano, E. Geoffrois, G. Gravier, J. Bonastre, D. Mostefa, K. Choukri |
Abstract | This paper presents the audio corpus developed in the framework of the ESTER evaluation campaign of French broadcast news transcription systems. The corpus includes 100 hours of manually annotated recordings and 1,677 hours of untranscribed data. The manual annotations include a detailed verbatim orthographic transcription, speaker turns and identities, information about acoustic conditions, and named entities. Additional resources generated by automatic speech processing systems, such as phonetic alignments and word graphs, are also described. |
Keywords | corpus, transcription, broadcast news, evaluation |
Full paper | Corpus description of the ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News |