LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | Creation of Spoken Hebrew Databases |
Authors | Rannon Tami (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, tami_r@nsc.co.il) Golani Ofra (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, ofrag@nsc.co.il) Goren Anat (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, anattr@nsc.co.il) Shammass Sherrie (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, shaunie@nsc.co.il) Moyal Ami (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, amym@nsc.co.il) |
Keywords | Hebrew, Semitic Language SR, Speech Recognition, Spoken Database, Telephony Applications |
Session | Session SP3 - Spoken Language Resources' Projects |
Full Paper | 52.ps, 52.pdf |
Abstract | Two Spoken Hebrew databases were collected over fixed telephone lines at NSC - Natural Speech Communication. Their creation was based on the SpeechDat model, and together they represent the first comprehensive spoken database in Modern Hebrew that can be successfully applied to the teleservices industry. The speakers are a representative sample of Israelis with respect to sociolinguistic factors such as age, gender, years of education and country of origin. The databases include digit sequences, natural numbers, money amounts, time expressions, dates, spelled words, application words and phrases for teleservices (e.g., call, save, play), phonetically rich words, phonetically rich sentences, and names. Both read speech and spontaneous speech were elicited. |