Title

Reusable Lexical Representations for Idioms

Author(s)

Jan Odijk

UIL-OTS

Session

O24-TW

Abstract

In this paper I introduce (1) a technically simple and highly theory-independent way of lexically representing flexible idiomatic expressions, and (2) a procedure for incorporating these lexical representations into a wide variety of NLP systems. The method is based on Structural EQuivalence Classes for Idioms and is therefore called the SEQCI method. I illustrate the approach using the Rosetta MT system as an example of an NLP system. I discuss the advantages of the method and some possible objections to it. I conclude that the method is a good candidate for a standard for the lexical representation of idioms. The method also has the potential to be used for multi-word expressions other than idioms.

Keyword(s)

Multiword expressions, idioms, standards, lexicons

Language(s)

Dutch, English

Full Paper

116.pdf