Title |
Making Text Resources Accessible to the Reader: the Case of Patent Claims |
Authors |
Simon Mille and Leo Wanner |
Abstract |
Hardly any other kind of text is as notoriously difficult to read as patents. This is first of all due to their abstract vocabulary and their very complex syntactic constructions. The claims in a patent are especially challenging: in accordance with international patent writing regulations, each claim must be rendered in a single sentence. As a result, sentences with more than 200 words are not uncommon. Paraphrasing the claims in terms the user can understand is therefore in high demand. We present a rule-based paraphrasing module that realizes paraphrasing of patent claims in English as a rewriting task. Prior to the rewriting proper, the module carries out the stages of simplification and of discourse and syntactic analysis. The rewriting makes use of a full-fledged text generator and consists of a number of genuine generation tasks such as aggregation, selection of referring expressions, choice of discourse markers, and syntactic generation. As the generator, we use the MATE workbench, which is based on the Meaning-Text Theory of linguistics. |
Language |
Single language |
Topics |
Tools, systems, applications, Paraphrasing, Controlled languages |
Full paper |
Making Text Resources Accessible to the Reader: the Case of Patent Claims |
Slides |
- |
Bibtex |
@InProceedings{MILLE08.352,
author = {Simon Mille and Leo Wanner},
title = {Making Text Resources Accessible to the Reader: the Case of Patent Claims},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = may,
date = {28-30},
address = {Marrakech, Morocco},
editor = {Nicoletta Calzolari and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |