Title |
Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era |
Authors |
Chien Lee-Feng (Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic of China, lfchien@iis.sinica.edu.tw); Lee Lin-Shan (Institute of Information Science, Academia Sinica, and Dept. of Electrical Engineering, National Taiwan University, Taipei, Taiwan, Republic of China, lsl@iis.sinica.edu.tw) |
Keywords |
Dynamic Corpora, Internet, Live Lexicon |
Session |
Session SP3 - Spoken Language Resources' Projects |
Full Paper |
214.ps, 214.pdf |
Abstract |
In the coming network era, a huge volume of information on all subject domains will be readily available via the network. This network information is dynamic, ever-changing and rapidly growing. Furthermore, many spoken language processing applications will have to deal with the content of this dynamic network information, so dynamic lexicons, language models and so on will be required. To cope with such a new network environment, automatic approaches for the collection, classification, indexing, organization and utilization of the linguistic data obtainable from the networks for language processing applications will be very important. On the one hand, high-performance spoken language technology can hopefully be developed based on such dynamic linguistic data on the network. On the other hand, such spoken language technology must also adapt intelligently to the content of the dynamic, ever-changing network information. Some basic concepts for live lexicons and dynamic corpora adapted to the network resources have been developed for Chinese spoken language processing applications and are briefly summarized in this paper. Although the major considerations here are for the Chinese language, the concepts may apply equally to other languages. |