SUMMARY : Session O8-W Lexicon Acquisition


Title Automatic extraction of subcategorization frames for French
Authors P. Chesley, S. Salmon-alt
Abstract This paper describes the automatic extraction of French subcategorization frames from corpora. The subcategorization frames have been acquired via VISL, a dependency-based parser (Bick 2003), whose verb lexicon is currently incomplete with respect to subcategorization frames. Therefore, we have implemented binomial hypothesis testing as a post-parsing filtering step. On a test set of 104 frequent verbs we achieve lower bounds on type precision at 86.8% and on token recall at 54.3%. These results show that, contra (Korhonen et al. 2000), binomial hypothesis testing can be robust for determining subcategorization frames given corpus data. Additionally, we estimate that our extracted subcategorization frames account for 85.4% of all frames in French corpora. We conclude that using a language resource, such as the VISL parser, with a currently unevaluated (and potentially high) error rate can yield robust results in conjunction with probabilistic filtering of the resource output.
Keywords Subcategorization frames, verbal lexicon,
Full paper Automatic extraction of subcategorization frames for French