LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title Spontaneous Speech Corpus of Japanese
Authors Maekawa Kikuo (The National Language Research Institute, 3-9-14 Nishiga’oka, Kita-ku, Tokyo 115-8620 Japan, kikuo@kokken.go.jp)
Koiso Hanae (The National Language Research Institute, 3-9-14 Nishiga’oka, Kita-ku, Tokyo 115-8620 Japan, koiso@kokken.go.jp)
Furui Sadaoki (Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, Tokyo 152-8552 Japan, furui@furui.cs.titech.ac.jp)
Isahara Hitoshi (Communications Research Laboratory, 588-2, Iwaoka, Nishi-ku, Kobe 651-2401 Japan, isahara@crl.go.jp)
Keywords Intonation Labeling, Japanese, Speech Recognition, Spontaneous Speech
Session Session SP3 - Spoken Language Resources' Projects
Full Paper 262.ps, 262.pdf
Abstract Design issues of a spontaneous speech corpus are described. The corpus under compilation will contain 800-1000 hours of spontaneously uttered Common Japanese speech together with morphologically annotated transcriptions. In addition, segmental and intonation labeling will be provided for a subset of the corpus. The primary application domain of the corpus is speech recognition of spontaneous speech, but we plan to make it useful for natural language processing and phonetic/linguistic studies as well.