Title |
Term and Collocation Extraction by Means of Complex Linguistic Web Services |
Authors |
Ulrich Heid, Fabienne Fritzinger, Erhard Hinrichs, Marie Hinrichs and Thomas Zastrow |
Abstract |
We present a web service-based environment for the use of linguistic resources and tools to address issues of terminology and language varieties. We discuss the architecture, corpus representation formats, components and a chainer supporting the combination of tools into task-specific services. Integrated into this environment, single web services also become part of complex scenarios for web service use. Our web services take, for example, corpora of several million words as input, on which they perform preprocessing (tokenisation, tagging, lemmatisation and parsing) and corpus exploration (collocation extraction and corpus comparison). Here we present an example of the extraction of single and multiword items typical of a specific domain or of a regional variety of German. We also give a critical review of needs and available functions from a user's point of view. The work presented here is part of ongoing experimentation in the D-SPIN project, the German national counterpart of CLARIN. |
Topics |
Lexicon, lexical database, MultiWord Expressions & Collocations, LR Infrastructures and Architectures |
Full paper |
Term and Collocation Extraction by Means of Complex Linguistic Web Services |
Slides |
Term and Collocation Extraction by Means of Complex Linguistic Web Services |
Bibtex |
@InProceedings{HEID10.363,
  author = {Ulrich Heid and Fabienne Fritzinger and Erhard Hinrichs and Marie Hinrichs and Thomas Zastrow},
  title = {Term and Collocation Extraction by Means of Complex Linguistic Web Services},
  booktitle = {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
  year = {2010},
  month = {may},
  date = {19-21},
  address = {Valletta, Malta},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {2-9517408-6-7},
  language = {english}
} |