Title | Japanese MULTEXT: A Prosodic Corpus |
Author(s) |
Kitazawa Shigeyoshi (1), Kiriyama Shinya (1), Itoh Toshihiko (1), Nick Campbell (2)
(1) Department of Computer Science, Faculty of Information, Shizuoka University; (2) ATR Human Information Science Research Labs |
Session | P27-SE |
Abstract | A prosodic corpus of Japanese was developed as a scheduled project by university researchers in Japan. This paper describes the contents of the corpus: speakers, speaking style, recording conditions, and prosodic annotations. The corpus is a Japanese version of the MULTEXT prosodic database of EUROM1. We adopted the J-ToBI prosodic labeling scheme, together with additional labels such as pitch range, prominence, devoicing, and nasalization. We also developed a method for automatically generating J-ToBI labels. Evaluation showed that 71.6% of tone labels were placed at the correct positions with the correct symbols, and that 73.7% of BI labels were generated correctly. The automatic prosodic label generator was evaluated by a team of expert labelers and a team of beginners, and was found to be helpful to both. |
Keyword(s) | MULTEXT, J-ToBI, Japanese, Prosodic Corpus |
Language(s) | Japanese |
Full Paper | 774.pdf |