LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title Automatic Transliteration and Back-transliteration by Decision Tree Learning
Authors Kang Byung-Ju (Department of Computer Science, Advanced Information Technology Research Center (AITrc), Korea Terminology Center for Language and Knowledge Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong, Yusong-gu, Taejon, 305-701, Korea, bjkang@world.kaist.ac.kr)
Choi Key-Sun (Korea Terminology Research Center for Language and Knowledge Engineering, Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong, Yusong-gu, Taejon, 305-701, Korea, kschoi@korterm.kaist.ac.kr)
Keywords  
Session Session WP6 - Tools in the Written Area
Abstract Automatic transliteration and back-transliteration between languages with drastically different alphabets and phoneme inventories, such as English/Korean, English/Japanese, English/Arabic, and English/Chinese, are of practical importance in machine translation, cross-lingual information retrieval, and automatic bilingual dictionary compilation. This paper describes a bi-directional and, to some extent, language-independent methodology for English/Korean transliteration and back-transliteration. The method is composed of character alignment and decision tree learning. We induce transliteration rules for each letter of the English alphabet and back-transliteration rules for each letter of the Korean alphabet. Training the decision trees requires a large set of labeled examples of transliteration and back-transliteration, but resources of this kind are generally not available. Our character alignment algorithm can align an English word with its Korean transliteration in the desired way with high accuracy.
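
The decision tree half of the method can be illustrated with a minimal sketch. It assumes the character alignment has already been done, so each training word arrives as a list of (English letter, output unit) pairs. The abstract does not specify the learning algorithm, the context features, or the window size, so the ID3-style learner, the two-letter context window, and all function names below are assumptions made for illustration. The toy alignments use invented romanized stand-ins for Korean characters and are not data from the paper.

```python
import math
from collections import Counter

PAD = "_"  # marks context positions that fall outside the word

def make_examples(aligned, window=2):
    """Turn one aligned word into (context, label) examples, one per letter."""
    letters = [src for src, _ in aligned]
    examples = []
    for i, (_, out) in enumerate(aligned):
        ctx = {}
        for off in range(-window, window + 1):
            j = i + off
            ctx[off] = letters[j] if 0 <= j < len(letters) else PAD
        examples.append((ctx, out))
    return examples

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def build_tree(examples, features):
    """ID3-style induction: split on the feature with the lowest remaining entropy."""
    labels = [y for _, y in examples]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a single output unit

    def remainder(f):
        groups = {}
        for x, y in examples:
            groups.setdefault(x[f], []).append(y)
        return sum(len(g) / len(examples) * entropy(g) for g in groups.values())

    best = min(features, key=remainder)
    branches = {}
    for x, y in examples:
        branches.setdefault(x[best], []).append((x, y))
    rest = [f for f in features if f != best]
    children = {v: build_tree(sub, rest) for v, sub in branches.items()}
    return {"feature": best, "default": majority, "children": children}

def classify(tree, ctx):
    # Descend until a leaf label is reached; unseen values fall back to the majority.
    while isinstance(tree, dict):
        tree = tree["children"].get(ctx[tree["feature"]], tree["default"])
    return tree

def train(aligned_words, window=2):
    """Learn one tree per source letter, following the per-letter rule induction."""
    by_letter = {}
    for aligned in aligned_words:
        for ctx, out in make_examples(aligned, window):
            by_letter.setdefault(ctx[0], []).append((ctx, out))
    features = [o for o in range(-window, window + 1) if o != 0]
    return {letter: build_tree(exs, features) for letter, exs in by_letter.items()}

def transliterate(trees, word, window=2):
    placeholders = [(ch, None) for ch in word]
    # A letter with no tree is copied through unchanged (an assumption of this sketch).
    return [classify(trees.get(ctx[0], ctx[0]), ctx)
            for ctx, _ in make_examples(placeholders, window)]

# Hypothetical toy alignments; output units are romanized stand-ins.
TOY_ALIGNED = [
    [("c", "k"), ("a", "ae"), ("t", "t")],
    [("c", "s"), ("i", "i"), ("t", "t"), ("y", "i")],
    [("c", "k"), ("o", "o"), ("t", "t")],
    [("c", "s"), ("e", "e"), ("l", "l"), ("l", "")],
]

trees = train(TOY_ALIGNED)
print(transliterate(trees, "cit"))
```

In this toy setting the tree for "c" splits on the following letter, so the unseen string "cit" receives the "s" reading learned from "city" and "cell". Back-transliteration could be sketched the same way by swapping the two sides of each aligned pair and training one tree per Korean letter.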