Title

Integrating Spanish Linguistic Resources in a Web Site Assistant

Authors

Paloma Martínez (Universidad Carlos III de Madrid, Avd. Universidad 30, 28911 Leganés, Madrid, Spain)

Ana García-Serrano (Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Madrid, Spain)

Alberto Ruiz-Cristina (Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660 Boadilla del Monte, Madrid, Spain)

Session

WP3: Tools & Components

Abstract

This work describes a proposal to improve web document retrieval by addressing two main problems in document searching. First, traditional web search engines miss documents that are relevant to the user query and retrieve many that are not. Second, query formulation is not as accessible as it could be, and some users have difficulty expressing Boolean queries. To improve the quality of Internet search engines, two main approaches have typically been adopted. One is the creation of a metasearch engine that draws on multiple search engines by unifying both the query language and the type of results they return. The other applies NLP techniques to extend queries so that they handle morphological, lexical, semantic and syntactic variations. Focusing on the second approach, we present the research project MESIA (project CAM 07T/0017/1998) for the Madrid Local Government web site (www.comadrid.es). Its main goal is to exploit general-purpose linguistic resources to extend user queries in order to enhance the answers provided by the AltaVista search engine.
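
As a rough illustration of the second approach, the sketch below turns a plain-text query into a Boolean query in which each term is replaced by a group of variants. It is not the MESIA implementation: the lexicon `VARIANTS`, the function `expand_query`, and the generic AND/OR output syntax are all assumptions made for this example, and the word lists are invented rather than drawn from any real linguistic resource.

```python
# Hypothetical toy lexicon: each query term maps to illustrative variants
# (plural forms and near-synonyms). Invented for this example only.
VARIANTS = {
    "ayuda": ["ayudas", "subvención", "subvenciones"],
    "vivienda": ["viviendas", "casa", "casas"],
}


def expand_query(query, variants=VARIANTS):
    """Turn a plain-text query into a Boolean query string.

    Each term becomes an OR-group containing the term and its known
    variants; the groups are then joined with AND. Terms without an
    entry in the lexicon are passed through unchanged.
    """
    groups = []
    for term in query.lower().split():
        alternatives = [term] + variants.get(term, [])
        if len(alternatives) == 1:
            groups.append(term)
        else:
            groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)


if __name__ == "__main__":
    print(expand_query("ayuda vivienda"))
```

The user types only the two words; the Boolean structure is built on their behalf, which is the accessibility point made above. A real system would take the variants from morphological and lexical resources rather than from a hand-written table.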

Keywords

Spanish linguistic resources

Full Paper

131.pdf