ABSTRACT
This paper describes the design of the 1000-speaker Slovenian SpeechDat database,
which was recorded within the SpeechDat II project (LE2-4001). The aim of the
project is to produce speech databases that will serve as a basis
for developing automatic telephone speech dialogue systems.
The database was recorded over the PSTN via an ISDN connection. An automatic recording platform was used, consisting of proprietary software running on a PC.
The speakers were recruited predominantly from among the employees of the Post of Slovenia. This made it easy to meet the requirements for a balanced distribution of age, gender and dialect. Speakers from the Post were required to complete the recording session; each session was checked and the outcome was reported back to the Post of Slovenia. The remaining speakers were students and employees of the University of Maribor and their relatives.
The database consists of 43 utterances per speaker. Utterances are stored as sequences of 8-bit, 8 kHz A-law speech samples. Each utterance is stored in a separate file and is accompanied by a SAM label file containing its transcription. The transcription is orthographic, with a few additional markers for audible speech and non-speech acoustic events. All transcriptions were made manually.
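As an illustration of the storage format described above, the following minimal Python sketch expands 8-bit A-law codes to 16-bit linear PCM using the standard ITU-T G.711 expansion. It is not part of the recording platform. The function names are hypothetical, and the sketch assumes that a speech file is a headerless stream of A-law bytes, which the text above does not state explicitly.

```python
SAMPLE_RATE_HZ = 8000  # from the text: 8 kHz sampling rate


def alaw_byte_to_pcm16(code: int) -> int:
    """Expand one 8-bit A-law code to a 16-bit linear PCM value (G.711)."""
    code ^= 0x55                     # undo the alternate-bit inversion
    mantissa = (code & 0x0F) << 4    # 4-bit mantissa, scaled
    segment = (code & 0x70) >> 4     # 3-bit segment (exponent)
    if segment == 0:
        value = mantissa + 8
    else:
        value = (mantissa + 0x108) << (segment - 1)
    # In A-law the sign bit is set for positive samples.
    return value if code & 0x80 else -value


def decode_alaw(data: bytes) -> list[int]:
    """Decode a raw A-law byte stream (assumed headerless) to PCM samples."""
    return [alaw_byte_to_pcm16(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """Duration of an utterance: one byte per sample at 8 kHz."""
    return len(data) / SAMPLE_RATE_HZ


if __name__ == "__main__":
    # Four illustrative A-law codes, not taken from the database.
    example = bytes([0xD5, 0x55, 0xAA, 0x2A])
    print(decode_alaw(example))
    print(duration_seconds(example))
```

Because each sample occupies one byte, the size of a speech file in bytes divided by 8000 gives the utterance duration in seconds, under the same headerless-file assumption.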
The corpus includes the following read and spontaneous items: application words, isolated digits, sequences of isolated digits, connected digits, dates, word spotting phrases, spelled words/phrases, currency amounts, natural numbers, directory assistance names, questions, time phrases, phonetically rich words and sentences.
The database will be distributed by ELRA on 5 CD-ROM volumes.