LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | What Are Transcription Errors and Why Are They Made? |
Authors | Oppermann Daniela (Institute of Phonetics and Speech Communication, Schellingstr. 3, 80799 Munich, Germany, daniela.oppermann@phonetik.uni-muenchen.de) Burger Susanne (Interactive Systems Laboratories, Carnegie Mellon University, Pittsburgh, USA, University of Karlsruhe, Germany, sburger@cs.cmu.edu) Weilhammer Karl (Institute of Phonetics and Speech Communication, Schellingstr. 3, 80799 Munich, Germany, karl.weilhammer@phonetik.uni-muenchen.de) |
Keywords | Annotation Errors, Data-Collection, Spontaneous Speech, Transcription |
Session | Session SP2 - Spoken Language Resources Issues from Construction to Validation |
Full Paper | 205.ps, 205.pdf |
Abstract | In recent work we compared transcriptions of German spontaneous dialogues from the VERBMOBIL corpus to ascertain differences between transcribers and in transcription quality. A better understanding of where inconsistencies occur, and of what kind, will help us to improve the working environment for transcribers and to reduce the effort spent on correction passes, and will ultimately result in better transcription quality. The results show that transcribers differ in how they perceive spontaneous speech phenomena, mainly prosodic phenomena such as pauses in speech and lengthening. During the correction pass, 80% of these labels had to be inserted. Additionally, the annotation of non-grammatical phrases and pronunciation comments seems to need a better explanation in the convention manual: here the correcting transcribers had to change 20% of the annotations. |