LREC 2000 2nd International Conference on Language Resources & Evaluation
Title | Spontaneous Speech Corpus of Japanese |
Authors | Maekawa Kikuo (The National Language Research Institute, 3-9-14 Nishiga’oka, Kita-ku, Tokyo 115-8620 Japan, kikuo@kokken.go.jp); Koiso Hanae (The National Language Research Institute, 3-9-14 Nishiga’oka, Kita-ku, Tokyo 115-8620 Japan, koiso@kokken.go.jp); Furui Sadaoki (Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, Tokyo 152-8552 Japan, furui@furui.cs.titech.ac.jp); Isahara Hitoshi (Communications Research Laboratory, 588-2, Iwaoka, Nishi-ku, Kobe 651-2401 Japan, isahara@crl.go.jp) |
Keywords | Intonation Labeling, Japanese, Speech Recognition, Spontaneous Speech |
Session | Session SP3 - Spoken Language Resources' Projects |
Full Paper | 262.ps, 262.pdf |
Abstract | Design issues of a spontaneous speech corpus are described. The corpus under compilation will contain 800-1000 hours of spontaneously uttered Common Japanese speech and its morphologically annotated transcriptions. In addition, segmental and intonation labeling will be provided for a subset of the corpus. The primary application domain of the corpus is speech recognition of spontaneous speech, but we also plan to make it useful for natural language processing and phonetic/linguistic studies. |