Title |
Webaffix: Discovering Morphological Links on the WWW |
Authors |
Nabil Hathout (ERSS / CNRS & Université de Toulouse Le Mirail - France 5, allées A. Machado, F-31058 Toulouse CEDEX 1); Ludovic Tanguy (ERSS / CNRS & Université de Toulouse Le Mirail - France 5, allées A. Machado, F-31058 Toulouse CEDEX 1) |
Session |
WP5: Components & Systems |
Abstract |
This paper presents a new language-independent method for finding morphological links involving newly coined words (i.e. words absent from reference word lists). Using the WWW as a corpus, the Webaffix tool detects occurrences of new derived lexemes formed with a given suffix, proposes a base lexeme for each following a standard scheme (such as noun-verb), and then runs a compatibility test on the resulting word pairs, using the Web again, this time as a source of cooccurrences. The validated word pairs are used to build generic morphological databases useful for a number of NLP tasks. We present and discuss an example use of Webaffix to find new noun/verb pairs in French. |
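The three steps named in the abstract (detecting new suffixed words, proposing a base, testing the pair by cooccurrence) can be illustrated with a minimal Python sketch. It is not the Webaffix implementation: the Web is replaced by a small in-memory list of page texts, the base is proposed by a plain suffix substitution (here the French "-isation" to "-iser", chosen only as an example), and all function names, the toy data and the `min_pages` threshold are assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple


def find_new_derived(candidates: Iterable[str], reference: Set[str], suffix: str) -> List[str]:
    """Keep word forms that end with the suffix and are absent from the reference list."""
    return sorted({w for w in candidates if w.endswith(suffix) and w not in reference})


def propose_base(derived: str, suffix: str, base_ending: str) -> str:
    """Propose a base lexeme by swapping the suffix for a base ending (assumed scheme)."""
    return derived[: -len(suffix)] + base_ending


def cooccurrence_count(pages: Iterable[str], a: str, b: str) -> int:
    """Count pages in which both words occur (stand-in for Web cooccurrence queries)."""
    count = 0
    for page in pages:
        tokens = set(page.lower().split())
        if a in tokens and b in tokens:
            count += 1
    return count


def link_pairs(candidates: Iterable[str], reference: Set[str], pages: List[str],
               suffix: str = "isation", base_ending: str = "iser",
               min_pages: int = 1) -> List[Tuple[str, str]]:
    """Return (derived, base) pairs that pass the cooccurrence test."""
    pairs = []
    for derived in find_new_derived(candidates, reference, suffix):
        base = propose_base(derived, suffix, base_ending)
        if cooccurrence_count(pages, derived, base) >= min_pages:
            pairs.append((derived, base))
    return pairs


# Toy data, invented for this example only.
CANDIDATES = ["googlisation", "pixelisation", "organisation", "nation"]
REFERENCE = {"organisation", "nation"}
PAGES = [
    "la googlisation du monde : faut-il tout googliser",
    "pixelisation des images",
]

if __name__ == "__main__":
    print(link_pairs(CANDIDATES, REFERENCE, PAGES))
```

On this toy data, "organisation" is discarded because it is in the reference list, "nation" because it does not carry the suffix, and "pixelisation" because its proposed base never appears on the same page. A realistic version would presumably also match inflected forms of the base and query a search engine instead of a local list.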
Keywords |
Morphological links |
Full Paper |