LREC 2000 2nd International Conference on Language Resources & Evaluation
 

Title ItalWordNet: a Large Semantic Database for Italian
Authors Roventini Adriana (Istituto di Linguistica Computazionale, CNR, Area della Ricerca di Pisa, Via Alfieri 1, Loc. S. Cataldo, Ghezzano 56010 (PI) – ITALY, adriana@ilc.pi.cnr.it)
Alonge Antonietta (Istituto di Linguistica Computazionale, CNR, Area della Ricerca di Pisa, Via Alfieri 1, Loc. S. Cataldo, Ghezzano 56010 (PI) – ITALY, antoalonge@libero.it)
Calzolari Nicoletta (Istituto di Linguistica Computazionale, CNR, Area della Ricerca di Pisa, Via Alfieri 1, Loc. S. Cataldo, Ghezzano 56010 (PI) – ITALY, glottolo@ilc.pi.cnr.it)
Magnini Bernardo (Istituto per la Ricerca Scientifica e Tecnologica, I-38050, Povo, Trento, magnini@irst.itc.it)
Bertagna Francesca (Consorzio Pisa Ricerche, Via S. Maria 40, Pisa 56100 - ITALY, F.Bertagna@ilc.pi.cnr.it)
Keywords Lexical Resources, Lexical Semantic Networks
Session Session WO11 - Mono-Multilingual Lexicon Acquisition and Building
Full Paper 129.ps, 129.pdf
Abstract This paper describes our ongoing work on developing a large semantic database within SI-TAL, an Italian national project that aims to build a set of integrated (compatible) resources and tools for the automatic processing of Italian. Within SI-TAL, ItalWordNet is the reference lexical resource and will contain information on about 130,000 word senses grouped into synsets. The database is not being created ex novo; rather, it extends and revises the Italian wordnet built within the EuroWordNet project. We first describe how the lexical coverage of our wordnet is being extended by adding adjectives, adverbs and proper nouns, plus a terminological subset for the economic and financial domain. We then illustrate the changes these extensions require in both the linguistic model and the data structure. In particular, we discuss (i) the new semantic relations identified to encode information on adjectives and adverbs, and (ii) the new architecture that includes the terminological subset.