Title |
A Part-of-Speech-Based Search Algorithm for Translation Memories |
Authors |
Reinhard Rapp (University of Mainz, FASK, 76711 Germersheim, Germany) |
Session |
WP1: Corpora & Corpus Tools |
Abstract |
State-of-the-art translation memory systems retrieve related sentences on the basis of orthographic similarity. This often leads to poor search results, since orthographically similar sentences are not necessarily semantically related. In this paper we propose a search algorithm that aims to reduce this problem by taking part-of-speech information into account. It requires that the parallel sentences stored in the translation memory be processed with standard tools for word alignment and part-of-speech tagging. The work described is part of an ongoing project in example-based machine translation. |
Keywords |
Example-based machine translation, Translation memory, Part-of-speech tagging, Retrieval algorithm, Sentence similarity |
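The abstract contrasts retrieval by orthographic similarity with retrieval that also takes part-of-speech information into account, but it does not spell out the algorithm. The following is only a minimal sketch of that contrast, not the method proposed in the paper. It assumes that sentences are already tagged, that tags come from a coarse tag set such as `PRON`, `VERB`, `DET`, that similarity is measured with Python's `difflib.SequenceMatcher`, and that the two scores are blended linearly. The function names, the `pos_weight` parameter, and the example sentences are all hypothetical.

```python
from difflib import SequenceMatcher

# A sentence is a list of (word, tag) pairs. The tags are assumed to come
# from an external part-of-speech tagger; here they are written by hand.


def orthographic_similarity(a, b):
    """Character-level similarity of the two surface strings (0.0 to 1.0)."""
    text_a = " ".join(word.lower() for word, _ in a)
    text_b = " ".join(word.lower() for word, _ in b)
    return SequenceMatcher(None, text_a, text_b).ratio()


def pos_similarity(a, b):
    """Similarity of the two tag sequences, ignoring the words themselves."""
    tags_a = [tag for _, tag in a]
    tags_b = [tag for _, tag in b]
    return SequenceMatcher(None, tags_a, tags_b).ratio()


def combined_score(a, b, pos_weight=0.5):
    """Linear blend of both scores; the weighting scheme is an assumption."""
    return ((1 - pos_weight) * orthographic_similarity(a, b)
            + pos_weight * pos_similarity(a, b))


def retrieve(query, memory, pos_weight=0.5):
    """Return the stored sentence with the highest combined score."""
    return max(memory, key=lambda entry: combined_score(query, entry, pos_weight))


# Hypothetical example data.
query = [("She", "PRON"), ("saw", "VERB"), ("the", "DET"), ("light", "NOUN")]
candidate_a = [("The", "DET"), ("saw", "NOUN"), ("was", "AUX"), ("light", "ADJ")]
candidate_b = [("He", "PRON"), ("heard", "VERB"), ("the", "DET"), ("noise", "NOUN")]
memory = [candidate_a, candidate_b]

if __name__ == "__main__":
    for weight in (0.0, 0.5, 1.0):
        best = retrieve(query, memory, pos_weight=weight)
        print(weight, " ".join(word for word, _ in best))
```

With `pos_weight=0.0` the sketch behaves like a purely orthographic search and prefers the first candidate, which shares the strings "saw" and "light" with the query although both words are used in a different part of speech. Giving weight to the tag sequence shifts the preference to the second candidate, whose grammatical structure matches the query. In a real translation memory the target-side sentences and word alignments mentioned in the abstract would also be stored; they are left out here.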