Title	Japanese MULTEXT: A Prosodic Corpus
Author(s)	Kitazawa Shigeyoshi (1), Kiriyama Shinya (1), Itoh Toshihiko (1), Nick Campbell (2); (1) Department of Computer Science, Faculty of Information, Shizuoka University; (2) ATR Human Information Science Research Labs
Session	P27-SE
Abstract	A prosodic corpus of Japanese was developed as a planned project by university researchers in Japan. This paper describes the contents of the corpus: speakers, speaking styles, recording conditions, and prosodic annotations. The corpus is a Japanese version of the MULTEXT prosodic database of EUROM1. We adopted the J-ToBI prosodic labeling scheme, together with additional labels such as pitch range, prominence, devoicing, and nasalization. We also developed an automatic generator of J-ToBI labels. In an evaluation, 71.6% of tone labels were placed at the correct positions with the correct symbols, and 73.7% of break index (BI) labels were generated correctly. The automatic prosodic label generator was evaluated by a team of expert labelers and a team of beginners, and was found to be helpful for both.
Keyword(s)	MULTEXT, J-ToBI, Japanese, Prosodic Corpus
Language(s)	Japanese
Full Paper	774.pdf