Title |
A Survey of Idiomatic Preposition-Noun-Verb Triples on Token Level |
Authors |
Fabienne Fritzinger, Marion Weller and Ulrich Heid |
Abstract |
Most of the research on the extraction of idiomatic multiword expressions (MWEs) has focused on the acquisition of MWE types. In the present work we investigate whether a text instance of a potentially idiomatic MWE is actually used idiomatically in a given context or not. Inspired by the dataset provided by Cook et al. (2008), we manually analysed 9,700 instances of potentially idiomatic preposition-noun-verb triples (a frequent pattern among German MWEs) to identify, on token level, idiomatic vs. literal uses. In our dataset, all sentences are provided along with their morpho-syntactic properties. We describe our data extraction and annotation steps, and we discuss quantitative results from both EUROPARL and a German newspaper corpus. We discuss the relationship between idiomaticity and morpho-syntactic fixedness, and we address issues of ambiguity between literal and idiomatic use of MWEs. Our data show that EUROPARL is particularly well suited for MWE extraction, as most MWEs in this corpus are indeed used only idiomatically. |
Topics |
Corpus (creation, annotation, etc.), Typological databases, Tools, systems, applications |
Full paper |
A Survey of Idiomatic Preposition-Noun-Verb Triples on Token Level |
Slides |
A Survey of Idiomatic Preposition-Noun-Verb Triples on Token Level |
Bibtex |
@InProceedings{FRITZINGER10.728,
  author = {Fabienne Fritzinger and Marion Weller and Ulrich Heid},
  title = {A Survey of Idiomatic Preposition-Noun-Verb Triples on Token Level},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |