LREC 2000: 2nd International Conference on Language Resources & Evaluation
Conference Papers
Title | Language Resources Development at the Spanish Royal Academy |
Authors |
Municio Angel Martin (Real Academia Espanola, Felipe IV 4, 28014 Madrid, Spain, email: amunicio@rae.es)
Rojo Guillermo (Dept. of Spanish Language, University of Santiago de Compostela, Burgo das Nacions, s/n., E-15771 Santiago de Compostela, Spain, email: fegrojo@usc.es)
Sanchez Leon Fernando (Real Academia Espanola, Felipe IV 4, 28014 Madrid, Spain, email: fsanchez@rae.es)
Pinillos Octavio (Real Academia Espanola, Felipe IV 4, 28014 Madrid, Spain, email: pinillos@rae.es) |
Keywords | Corpus, Grammars, Lexicography, Lexicon, Morphological Analysis, NLP Tools, Spanish, Spoken Corpus |
Session | Session WO15 - Language Resources Projects |
Abstract | This paper explains some of the most relevant issues concerning the development of language resources at the Spanish Royal Academy (RAE). Two 125-million-word corpora of Spanish (one synchronic, one diachronic) and three specialized corpora have been developed. Around these corpora, the RAE is also developing NLP tools and resources for their morphosyntactic annotation. The most relevant are the computational lexicon, the morphological analysis tools, the disambiguation grammars, and the tokenizer generator. The last section describes the lexicographic use of corpus materials and includes a brief description of the corpus-based lexicographical workbench and its related tools. |