LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title Creation of Spoken Hebrew Databases
Authors Rannon Tami (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, tami_r@nsc.co.il)
Golani Ofra (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, ofrag@nsc.co.il)
Goren Anat (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, anattr@nsc.co.il)
Shammass Sherrie (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, shaunie@nsc.co.il)
Moyal Ami (NSC, Natural Speech Communication Ltd., 33 Lazarov ST., P.O. Box 5212, Rishon-LeZion 75150, Israel, amym@nsc.co.il)
Keywords Hebrew, Semitic Language SR, Speech Recognition, Spoken Database, Telephony Applications
Session Session SP3 - Spoken Language Resources' Projects
Full Paper 52.ps, 52.pdf
Abstract Two Spoken Hebrew databases were collected over fixed telephone lines at NSC - Natural Speech Communication. Their creation was based on the SpeechDat model, and together they represent the first comprehensive spoken database in Modern Hebrew that can be successfully applied to the teleservices industry. The speakers are a representative sample of Israelis, selected according to sociolinguistic factors such as age, gender, years of education and country of origin. The databases include digit sequences, natural numbers, money amounts, time expressions, dates, spelled words, application words and phrases for teleservices (e.g., call, save, play), phonetically rich words, phonetically rich sentences, and names. Both read speech and spontaneous speech were elicited.