LREC 2000: 2nd International Conference on Language Resources & Evaluation
Conference Papers
Title | The ISLE Corpus of Non-Native Spoken English |
Authors |
Menzel, Wolfgang (Universität Hamburg, Fachbereich Informatik, Vogt-Kölln-Strasse 30, 22527 Hamburg, Germany, menzel@informatik.uni-hamburg.de)
Atwell, Eric (School of Computer Studies, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, United Kingdom, eric@scs.leeds.ac.uk)
Bonaventura, Patrizia (Universität Hamburg, Fachbereich Informatik, Vogt-Kölln-Strasse 30, 22527 Hamburg, Germany, pbonaven@informatik.uni-hamburg.de)
Herron, Daniel (Universität Hamburg, Fachbereich Informatik, Vogt-Kölln-Strasse 30, 22527 Hamburg, Germany, herron@informatik.uni-hamburg.de)
Howarth, Peter (University of Leeds, Woodhouse Lane, Leeds LS2 9JT, Great Britain, p.a.howarth@leeds.ac.uk)
Morton, Rachel (Entropic Cambridge Research Labs, Compass House, 80-82 Newmarket Road, Cambridge, CB1 4LD, Great Britain, rim@entropic.co.uk)
Souter, Clive (School of Computer Studies, University of Leeds, Woodhouse Lane, Leeds LS2 9JT, United Kingdom, cs@scs.leeds.ac.uk) |
Keywords | Non-Native Speech, Pronunciation Training, Speech Corpus Annotation, Speech Corpus Design, Speech Recognition |
Session | Session SP3 - Spoken Language Resources' Projects |
Abstract | To support the development of pronunciation training tools for second-language learning, a corpus of non-native speech data has been collected. It consists of almost 18 hours of annotated speech signals spoken by Italian and German learners of English. The corpus is based on 250 utterances selected from typical second-language learning exercises. It has been annotated at the word and the phone level to highlight pronunciation errors such as phone realisation problems and misplaced word stress assignments. The data has been used to develop and evaluate several diagnostic components, which can be used to produce corrective feedback of unprecedented detail to a language learner. |