LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title An Architecture for Document Routing in Spanish: Two Language Components, Pre-processor and Parser
Authors Rojo Guillermo (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fegrojo@usc.es)
Alvarez Maria Concepcion (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, femcal@usc.es)
Alvarino Pilar (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fepili@usc.es)
Gil Adelaida (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, iagilma@usc.es)
Santalla Maria Paula (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fempsr@usc.es)
Sotelo Susana (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, fesdocio@usc.es)
Keywords Document Routing, Information Retrieval, Parsing, Syntactic Normalization
Session Session WO9 - Applications in the Written Area
Abstract This paper describes the language components of a system for Document Routing in Spanish. The system uses natural language processing techniques to identify the terms in each document that are relevant for classification. These techniques are based on the isolation and normalization of the syntactic units considered relevant for classification, especially noun phrases, but also other constituents built around verbs, adverbs, pronouns or adjectives. After a general introduction to the research project, the second Section relates our approach to previous and current approaches to the problem, and the third describes the corpora used to evaluate the system. The linguistic analysis architecture, which includes pre-processing and two different levels of syntactic analysis, is described in the fourth and fifth Sections. The last Section presents a comparative analysis of the results obtained from processing the corpora introduced in the third Section, and also outlines certain future developments of the system.

 
