LREC 2000 - Papers

LREC 2000 2nd International Conference on Language Resources & Evaluation

Conference Papers

Title An Architecture for Document Routing in Spanish: Two Language Components, Pre-processor and Parser

Authors Rojo Guillermo (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fegrojo@usc.es)
Alvarez Maria Concepcion (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, femcal@usc.es)
Alvarino Pilar (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fepili@usc.es)
Gil Adelaida (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, iagilma@usc.es)
Santalla Maria Paula (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fempsr@usc.es)
Sotelo Susana (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fesdocio@usc.es)

Keywords Document Routing, Information Retrieval, Parsing, Syntactic Normalization

Session Session WO9 - Applications in the Written Area

Abstract This paper describes the language components of a system for Document Routing in Spanish. The system uses natural language processing techniques to identify, in the documents involved, the terms relevant for classification. These techniques are based on the isolation and normalization of syntactic units considered relevant for the classification, especially noun phrases, but also other constituents built around verbs, adverbs, pronouns or adjectives. After a general introduction to the research project, the second section relates our approach to previous and current approaches to the problem, and the third describes the corpora used for evaluating the system. The linguistic analysis architecture, including pre-processing and two different levels of syntactic analysis, is described in the fourth and fifth sections. The last section is dedicated to a comparative analysis of the results obtained from processing the corpora introduced in the third section, and also outlines certain future developments of the system.
