Title |
GRASS: the Graz corpus of Read And Spontaneous Speech |
Authors |
Barbara Schuppler, Martin Hagmueller, Juan A. Morales-Cordovilla and Hannes Pessentheiner |
Abstract |
This paper describes the preparation, the speakers, the recordings, and the creation of the orthographic transcriptions of the first large-scale speech database for Austrian German. The database contains approximately 1900 minutes of read and spontaneous speech produced by 38 speakers. The corpus consists of three components. First, the Conversation Speech (CS) component contains one-hour free conversations between friends, colleagues, couples, or family members. Second, the Commands Component (CC) contains commands and keywords that were either read or elicited by pictures. Third, the Read Speech (RS) component contains phonetically balanced sentences and digits. The speech of all components was recorded at super-wideband quality in a soundproof recording studio with head-mounted microphones, large-diaphragm microphones, a laryngograph, and a video camera. The orthographic transcriptions, which were created and subsequently corrected manually, contain approximately 290 000 word tokens from 15 000 different word types. |
Topics |
Speech Resource/Database, Phonetic Databases, Phonology |
Full paper |
GRASS: the Graz corpus of Read And Spontaneous Speech |
Bibtex |
@InProceedings{SCHUPPLER14.394,
  author = {Barbara Schuppler and Martin Hagmueller and Juan A. Morales-Cordovilla and Hannes Pessentheiner},
  title = {GRASS: the Graz corpus of Read And Spontaneous Speech},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year = {2014},
  month = {may},
  date = {26-31},
  address = {Reykjavik, Iceland},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-8-4},
  language = {english}
} |