Title

Automatic extraction of differences between spoken and written languages, and automatic translation from the written to the spoken language

Authors

Masaki Murata (Communications Research Laboratory, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan)

Hitoshi Isahara (Communications Research Laboratory, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan)

Session

SP2: Speech Varieties And Multilingual ASR

Abstract

We extracted the differences between spoken and written language by applying the UNIX command "diff" to a spoken-language corpus and a written-language corpus, and examined these differences to determine how the grammars of the two corpora are constructed. We also transformed written-language sentences into spoken-language sentences by using rules based on the extracted differences.
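The two steps described in the abstract, extracting differences and then rewriting with rules built from them, can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions the abstract does not state: Python's standard `difflib.SequenceMatcher` stands in for the UNIX diff command, the sentences are already tokenized and paired one-to-one, and the function names (`extract_differences`, `build_rules`, `apply_rules`) and the English toy sentences are hypothetical, chosen only for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def extract_differences(written_tokens, spoken_tokens):
    """Return (written_span, spoken_span) pairs where the two sequences differ."""
    # SequenceMatcher is a stand-in for UNIX diff (an assumption of this sketch).
    matcher = SequenceMatcher(a=written_tokens, b=spoken_tokens, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            pairs.append((tuple(written_tokens[i1:i2]),
                          tuple(spoken_tokens[j1:j2])))
    return pairs


def build_rules(aligned_pairs, min_count=1):
    """Count differing spans over sentence pairs and keep the most frequent rewrite."""
    counts = Counter()
    for written, spoken in aligned_pairs:
        for w_span, s_span in extract_differences(written, spoken):
            if w_span:  # skip pure insertions: there is no written span to match on
                counts[(w_span, s_span)] += 1
    rules = {}
    for (w_span, s_span), n in counts.most_common():
        if n >= min_count and w_span not in rules:
            rules[w_span] = s_span
    return rules


def apply_rules(tokens, rules):
    """Rewrite a written-style token list, preferring the longest matching span."""
    max_len = max((len(k) for k in rules), default=0)
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in rules:
                out.extend(rules[span])
                i += n
                break
        else:
            # No rule matched at this position: copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical toy data, not taken from the paper's corpora.
written = "I do not know".split()
spoken = "I don't know".split()
rules = build_rules([(written, spoken)])
converted = apply_rules("we do not agree".split(), rules)
```

In this sketch a rule is simply the most frequent spoken-language replacement observed for a written-language span. The `min_count` threshold is one possible way to discard rare, noisy differences.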

Keywords

Differences between spoken and written languages, Automatic translation between the written and spoken languages, Diff

Full Paper

27.pdf