Workshop on Language Resources for European Minority Languages

Granada, Spain - May 27 1998 (morning)

Call for Papers

This workshop will be held in conjunction with the International Conference on Language Resources and Evaluation (LREC), Granada, Spain: May 28-30, 1998. The workshop will provide a forum for researchers working on the development of speech and language resources for the indigenous minority languages of Europe.

Workshop scope and aims

The minority or "lesser used" languages of Europe (e.g. Basque, Welsh, Breton) are under increasing pressure from the major languages. Some of them (e.g. Gaelic) are becoming endangered, but others (e.g. Catalan) are in a stronger position, with a certain amount of official recognition and funding. However, the situation with regard to language resources is fragmented and disorganised. Some minority languages have been adequately researched linguistically, but most have not, and the vast majority do not yet possess basic speech and language resources (such as text and speech corpora) which are sufficient to permit commercial development of products.

If this situation continues, the minority languages of Europe will fall a long way behind the major languages in the availability of commercial speech and language products. This in turn will accelerate the decline of those languages that are already struggling to survive, as speakers are forced to use the majority language for interaction with these products. To break this vicious circle, it is important to encourage the development of basic language resources.

The workshop is a very small first step towards encouraging the development of such resources. The aim is to share information and to form personal contacts, which at present do not exist, so that isolated researchers with little funding and no existing corpora need not start from nothing when developing a usable speech or text database. There will be a balance between presentations of existing language resources and more general presentations designed to give background information.

Technical areas covered will include:

Papers are invited that describe existing speech and language resources for minority languages (speech databases, text databases, and lexicons), as well as papers based on the analysis of these resources. Presentations will last 20 minutes each. All presentations will be given in English, since it cannot be assumed that each listener will speak all the minority languages discussed.


Briony Williams, University of Edinburgh, Scotland, UK
Climent Nadeu, Universitat Politècnica de Catalunya, Catalunya, Spain
Alex Monaghan, Dublin City University, Ireland

Paper submission

Papers should not exceed 4000 words or 10 pages. They can be submitted in one of two ways: hard copy or electronic submission. They should be in A4 size and in English.

  1. Hard copies: Three hard copies should be sent to:
    Dr. Briony Williams
    80 South Bridge
    Edinburgh EH1 1HN
    Scotland, UK

    Please also send an email to Briony Williams informing her of the hard copy submission, in case the hard copy does not reach its destination. This email should contain the information specified in the section below.

  2. Electronic submission: Electronic submissions may be in self-contained LaTeX, PostScript or MS-Word format. Submissions should be sent to [...]. An electronic submission should be accompanied by a plain ASCII text email message giving the following details:
    #NAME   Name of first author
    #TITLE  Title of the paper
    #PAGES  Number of pages
    #NOTE   Any relevant instructions about the format etc.
    #ABSTR  Abstract of the paper
    #EMAIL  Email address of the first author
    #ADDR   Postal address of the first author
    #TEL    Telephone number of the first author
    #FAX    Fax number of the first author

Important dates

Paper submission deadline    February 27
Paper notification           March 27
Camera-ready papers due      April 15
Workshop                     May 27

Conference information

General information about the main conference is available on the LREC web page.


            "Language Resources for European Minority Languages"

              Wednesday May 27 1998 (morning), Granada, Spain
           In association with the First International Conference on
        Language Resources and Evaluation, May 28-30 1998, Granada, Spain

 8:00  Registration

 8:30  Welcome and Introduction

 8:40  "Overview of minority languages in Europe".  Marc Alemany (Catalan
       Sociolinguistic Institute).

 9:00  "VOCATEL and VOGATEL: Two Telephone Speech Databases of Spanish Minority
       Languages (Catalan and Galician)".  Luis Villarrubia, Paloma Leon, Luis
       Hernandez (Speech Technology Group, Telefonica I&D, Madrid, Spain);
       Climent Nadeu, Ignasi Esquerra, Javier Hernando (Dept. TSC, Universitat
       Politècnica de Catalunya, Barcelona, Spain); Carmen Garcia-Mateo, Laura
       Docio (ETSIT de Telecomunicación, Universidad de Vigo, Vigo, Spain).

 9:20  "Written Linguistic Resources in Catalan: the DCC Project". Joan Soler
       Bou (Institut d'Estudis Catalans, Barcelona, Spain).

 9:40  "The MELIN project". Donncha Ó Cróinín (Institiúid Teangeolaíochta
       Éireann/Linguistics Institute of Ireland, Dublin, Ireland).

10:00  COFFEE

10:30  "A framework for the automatic processing of Basque". I. Aldezabal,
       O. Ansa, J.M. Arriola, A. Díaz de Ilarraza, N. Ezeiza, A. Maritxalar,
       M. Oronoz, K. Sarasola (Euskal Herriko Unibertsitatea, Spain); I.
       M. Urkia (UZEI, Donostia, Spain).

10:50  "Towards the creation of new Galician language resources: From a printed
       dictionary to the Galician WordNet".  Fernando Magan (Ramón Piñeiro
       Research Center for Humanities, Santiago de Compostela, Spain).

11:10  Poster Session 1  (odd-numbered authors at posters)

11:50  Poster Session 2  (even-numbered authors at posters)

12:30  Plenary

13:30  End


                             Poster papers

 1  "A tagger environment for Galician". M. Vilares, J. Graña (Universidade da
    Coruña, Spain); T. Araujo, D. Cabrero, I. Diz (Ramón Piñeiro Research
    Center for Humanities, Santiago de Compostela, Spain).

 2  "A bilingual Spanish-Catalan database of units for concatenative
    synthesis". I. Esquerra, A. Bonafonte, F. Vallverdú, A. Febrer
    (Universitat Politècnica de Catalunya, Barcelona, Spain).

 3  "Methods and tools for building the Catalan WordNet". L. Benítez, S. 
    Cervell, G. Escudero, M. López, G. Rigau, M. Taulé (Universitat Politècnica
    de Catalunya, Barcelona, Spain; Universitat de Barcelona).

 4  "Lemmatisation of the corpus of Cornish". J. Mills (University of Luton,
    England, UK).

 5  "SpeechDat Cymru: A large-scale telephony Welsh database". R.J. Jones,
    J.S. Mason (Univ. of Wales, Swansea, Wales, UK); L. Helliker, M. Pawlewski
    (BT Labs, Ipswich, England, UK).

 6  "KGB Project: Tools and resources for Breton language learning". J. Siroux,
    H. Gourmelon, G. Mercier, J-P. Messager (ENSSAT, Lannion, France).

 7  "A speech database in Basque language". K. López de Ipiña, I. Torres,
    L. Oñederra (Euskal Herriko Unibertsitatea, Spain).

 8  "An overview of the existing language resources for 'Gallego'".
    C. García-Mateo (Universidade de Vigo, Spain); M. González-González
    (Universidade de Santiago, Spain).

 9  "Language standardisation and linguistic resources: The case of Central
    Ladin (Dolomites)". F. Ciochetti (Istitut Ladin, Vigo di Fassa, Italy);
    F. Pianesi (IRST, Trento, Italy).

10  "The LE-PAROLE project and the National Corpus of Irish". D. Ó Cróinín,
    E. Uí Dhonnchadha (Institiúid Teangeolaíochta Éireann/Linguistics Institute
    of Ireland, Dublin, Ireland).

11  "Design of a phonetic corpus for speech recognition in Catalan".  I.
    Esquerra, C. Nadeu (Universitat Politècnica de Catalunya, Barcelona,
    Spain); L. Villarrubia (Telefónica Investigación y Desarrollo, Madrid,
    Spain).

12  "Levels of annotation for a Welsh speech database for phonetic research".
    B. Williams (University of Edinburgh, Scotland, UK).


Specific queries about the conference should be directed to:

LREC Secretariat
Facultad de Traduccion e Interpretacion
Dpto. de Traduccion e Interpretacion
C/ Puentezuelas, 55
18002 Granada, SPAIN
Tel: +34 58 24 41 00 - Fax: +34 58 24 41 04