LREC 2000 2nd International Conference on Language Resources & Evaluation | |
Conference Papers and Abstracts
Papers and abstracts by paper title: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Papers and abstracts by ID number: 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-377. |
Named Entity Recognition in Greek Texts | In this paper, we describe work in progress for the development of a named entity recognizer for Greek. The system aims at information extraction applications where large scale text processing is needed. Speed of analysis, system robustness, and results accuracy have been the basic guidelines for the system’s design. Our system is an automated pipeline of linguistic components for Greek text processing based on pattern matching techniques. Non-recursive regular expressions have been implemented on top of it in order to capture different types of named entities. For development and testing purposes, we collected a corpus of financial texts from several web sources and manually annotated part of it. Overall precision and recall are 86% and 81% respectively. | |
NaniTrans: a Speech Labelling Tool | This paper deals with a description of NaniTrans, a tool for segmentation and labeling of speech. The tool is programmed to work on the MATLAB application interface, in any of the supported platforms (Unix, Windows, Macintosh). The tool has been designed to annotate large speech databases, which can be also partially preprocessed (but require manual supervision). It supports the definition of an environment of annotation: set of annotation levels (orthographic, phonetic, etc.), display mode (how to show information), graphic representation (waveform, spectrogram), keyboard short-cuts, etc. This configuration is then used on a speech database. A safe file locking system allows many annotators to work concurrently on the same speech database. The tool is very friendly and easy to use by non experienced annotators, and it is designed to optimize speed using both keyboard and mouse. New options or speech processing tools can be easily added by using any MATLAB or user defined function. | |
NL-Translex: Machine Translation for Dutch | NL-Translex is an MLIS-project which is funded jointly by the European Commission, the Dutch Language Union, the Ducth Ministry of Education, Culture and Science, the Dutch Ministry of Economic Affairs and the Flemish Institute for the Promotion of Scientific and Technological Research in Industry. The aim of this project is to develop Machine Translation components that will handle unrestricted text and translate Dutch from and into English, French and German. In addition to this practical aim, the partners in this project all have objectives relating to strategy, language policy and culture. The modules to be developed are intended primarily for use by EU institutions and the translation services of official bodies in the Member States. In this paper we describe in detail the aims and structure of the project, the user population, the available resources and the activities carried out so far, in particular the procedure followed for the call for tenders aimed at selecting a technonolgy provider. Finally, we describe the acceptance procedure, the strategic impact of the project and the dissemination plan. |