Summary of the paper

Title A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content
Authors Julia Maria Schulz, Daniela Becks, Christa Womser-Hacker and Thomas Mandl
Abstract In order to extract meaningful phrases from corpora (e.g. in an information retrieval context), intensive knowledge of the domain in question and the respective documents is generally needed. When moving to a new domain or language, the underlying knowledge bases and models need to be adapted, which is often time-consuming and labor-intensive. This paper addresses the described challenge of phrase extraction from documents in different domains and languages and proposes an approach that does not use comprehensive lexica and can therefore be easily transferred to new domains and languages. The effectiveness of the proposed approach is evaluated on user generated content and documents from the patent domain in English and German.
Topics Information Extraction, Information Retrieval, Multilinguality, Tools, systems, applications
Full paper A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content
Bibtex @InProceedings{SCHULZ12.466,
  author = {Julia Maria Schulz and Daniela Becks and Christa Womser-Hacker and Thomas Mandl},
  title = {A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
 }