Title | Enriching a Thai Lexical Database with Selectional Preferences |
Author(s) |
Canasai Kruengkrai, Thatsanee Charoenporn, Virach Sornlertlamvanich, Hitoshi Isahara
Thai Computational Linguistics Laboratory, Communications Research Laboratory, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120, {canasai,thatsanee,virach}@crl-asia.org, isahara@crl.go.jp |
Session | O45-STW |
Abstract | We propose a statistical, corpus-based approach for acquiring the selectional preferences of verbs. By parsing text corpora, we obtain examples of context nouns that are taken to be the selectional preferences of a given verb. The approach generalizes the initial noun classes to the most appropriate levels of a semantic hierarchy. We present an iterative generalization algorithm that combines agglomerative merging with a model selection technique, the Bayesian Information Criterion (BIC). In our experiments, we treat the Web as a large corpus, and we also propose approaches for extracting examples from it. Preliminary experimental results are given to show the feasibility and effectiveness of our approach. |
Keyword(s) | Lexical Database, Thai Lexical Database, Selectional Preferences, Bayesian Information Criterion |
Language(s) | Thai, English |
Full Paper | 698.pdf |