LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title: Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora
Authors: Le Sun (Open Systems & Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080, P. R. China; lesun@sonata.iscas.ac.cn)
Youbing Jin (Open Systems & Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080, P. R. China; ybjin@sonata.iscas.ac.cn)
Lin Du (Open Systems & Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080, P. R. China; ldu@sonata.iscas.ac.cn)
Yufang Sun (Open Systems & Chinese Information Processing Center, Institute of Software, Chinese Academy of Sciences, Beijing 100080, P. R. China; yfsun@sonata.iscas.ac.cn)
Keywords: Bilingual Corpora Processing, Sentence Alignment, Term Extraction
Session: Session WO11 - Mono-Multilingual Lexicon Acquisition and Building
Abstract: This paper describes a system designed to extract English-Chinese term lexicons from noisy, complex bilingual corpora and to use them as a translation lexicon for checking sentence alignment results. The noisy bilingual corpora are first aligned by our improved length-based statistical approach, which can partly detect sentence omissions and insertions. A term extraction system then obtains term translation lexicons from the roughly aligned corpora. Next, the statistical approach is used to align the corpora again. Finally, we filter the noisy bilingual texts and obtain nearly perfectly aligned corpora.
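
The first step of the pipeline described in the abstract, length-based sentence alignment with omission and insertion handling, can be illustrated with a minimal sketch. This is not the authors' improved approach, whose details are not given here. It is a simplified dynamic-programming aligner under stated assumptions: sentence length is measured in characters, the bead penalties in BEAD_PENALTY and SKIP_PENALTY are hypothetical values, and the ratio parameter (expected target length per unit of source length) is a hypothetical knob that would need tuning for English-Chinese text.

```python
from typing import List, Tuple

# Hypothetical penalties for beads that pair text on both sides:
# (number of source sentences, number of target sentences) -> penalty.
BEAD_PENALTY = {(1, 1): 0.0, (2, 1): 1.0, (1, 2): 1.0}
# Hypothetical flat cost for an omitted (1-0) or inserted (0-1) sentence.
SKIP_PENALTY = 4.0

Bead = Tuple[List[int], List[int]]


def length_cost(src_len: int, tgt_len: int, ratio: float) -> float:
    """Squared, length-normalised deviation of tgt_len from ratio * src_len."""
    expected = ratio * src_len
    return (tgt_len - expected) ** 2 / max(expected + tgt_len, 1.0)


def align(src: List[str], tgt: List[str], ratio: float = 1.0) -> List[Bead]:
    """Return beads as (source indices, target indices), in document order."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # All candidate moves: matching beads plus the two skip beads.
    moves = list(BEAD_PENALTY) + [(1, 0), (0, 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in moves:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                if di == 0 or dj == 0:
                    step = SKIP_PENALTY
                else:
                    s_len = sum(len(s) for s in src[i:ni])
                    t_len = sum(len(t) for t in tgt[j:nj])
                    step = BEAD_PENALTY[(di, dj)] + length_cost(s_len, t_len, ratio)
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)

    # Walk the back-pointers from the end to recover the cheapest path.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads
```

A bead with an empty target list marks a source sentence that has no counterpart, which is the kind of omission the abstract says the first pass detects only partly. In the described pipeline, the term lexicon extracted from this rough alignment is then used to check and refine the result.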

 
