Minimizing the Effort for Language Resource Acquisition

Granada, Spain, 26 May 1998

in conjunction with

The First International Conference
on Language Resources and Evaluation

Granada, Spain, 28-30 May 1998

An applied NLP system must produce adequate results and must be deployable within a reasonable time. Gathering and acquiring language resources to build an application system is very time-consuming, and it is imperative to find ways of speeding up the acquisition of high-quality, useful static knowledge sources such as grammars, lexicons and corpora. The viability of avoiding massive resource acquisition, where that is possible, must also be carefully considered.

Resource acquisition should include methods, based on both sound theoretical principles and practical experience, for deciding, among other things, how much knowledge one *really* needs for a given application. Although a correlation between the two certainly exists, increasing the size of knowledge sources, or their number and variety, does not necessarily lead to a commensurate improvement of output quality in an application, and it definitely leads to much higher costs.

No matter how large the acquired resources are and how many of them have been acquired, there will always remain a residue of language processing problems which can be tackled only by forgoing the requirement of full automation and involving expensive semi-automatic or even manual acquisition. It becomes imperative, therefore, to assess when static knowledge source acquisition is NO LONGER PROFITABLE. For example, in a system for interactive authoring and automatic generation of patent claim texts, the lexical knowledge base can be restricted to a lexicon of domain-related verbs marked for subcategorization (as the nominals are provided interactively by the author).


The technological issues to be discussed at the conference include, BUT ARE NOT LIMITED TO:

We particularly encourage reports about actual practical large-scale resource acquisition efforts in which economy of effort has been a conscious choice.

Organizing Committee:

Svetlana Sheremetyeva, NMSU CRL, USA (Chair)
Eduard Hovy, USC ISI, USA
Bernardo Magnini, IRST, Italy
Sergei Nirenburg, NMSU CRL, USA
Victor Raskin, Purdue University, USA
Frederique Segonde, Xerox Research Centre Europe, France
Leo Wanner, University of Stuttgart, Germany


Papers should not exceed 4000 words or 10 pages. Presentations will be selected on the basis of a review of papers and project reports.


Each submission should include a title page containing the title, author(s), affiliation(s), submitting author's mailing address, telephone number, fax number and e-mail address.

Authors may submit three hard copies OR submit ELECTRONICALLY in PostScript format to:

Svetlana Sheremetyeva
Computing Research Laboratory
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, New Mexico 88003-8001, USA

Receipt of submissions will be acknowledged.


Thursday, February 19, 1998   Submissions due
Monday, March 16, 1998        Acceptances and rejections sent to authors
Friday, April 10, 1998        Final papers due
Tuesday, May 26, 1998         Workshop date

Registration for the workshop will be:

10,000 pesetas for those not attending LREC
5,000 pesetas for those attending LREC

These fees will include a coffee break and the proceedings of the workshop.

Participation in the workshop will be limited by the capacity of the venue. Requests for participation will be processed on a first-come, first-served basis.

The updated program of the LREC workshop "Minimizing the
Effort for Language Resource Acquisition", May 26

Svetlana Sheremetyeva, Organizer

2.40-2.50 Introduction. S. Sheremetyeva

2.50-3.25 Reusing Swedish Language Processing Resources in SVENSK
          F.Olsson, B.Gamback and M.Eriksson

3.25-4.00 Using an English Corpus to Produce a Balanced Bi-lingual Lexicon
          J.Cowie, J.Longwell and C.Keller

4.00-4.30 Two Experiments on Balancing the Acquisition Effort and
          Automation Level with Needs of an Application
          S.Sheremetyeva and S.Nirenburg  

4.30-5.00 Break

5.00-5.35 Speeding-up the Building of New Ontologies using Bilingual

5.35-6.10 Minimization Strategies in NeuroTran
          N.Koncar, S.Pawlowski, D.Sipka and V.Sipka

6.10-6.45 A Cost-Effective Approach to Multilingual Lexicon Acquisition
          E.Viegas, S.Nirenburg, B.Onyshkevych and V.Raskin

6.45-7.30 General Discussion.