The workshop addresses one of the key challenges in the development of NLP applications: bridging a generic framework with domain- and task-specific requirements. The issue reduces to the problem of customizing linguistic software and the degree to which this effort can be limited once a given component (e.g., grammar, lexicon, thesaurus, interpreter, or tagger) is used across applications and across domains.
As natural language products play an increasingly prominent role in the market for knowledge management, information extraction, search and navigation, technology suppliers still struggle to produce high-quality software that can be deployed to new domains and relatively similar tasks quickly and within budget.
The issue of genericity versus specificity plays out in a number of areas; the key to solving the problem is identifying whether the dilemma has a particular locus: the architecture of the system, the language resources, or the application components.
The problem of customization directly affects the architecture, the development process, and the evaluation benchmarks for NLP systems. Although the notion of a "knowledge bottleneck" has mainly been associated with NLP systems that rely on knowledge representation strategies, even statistical NLP systems suffer from a customization problem.
The goal of the workshop is to emphasize the tension as well as the potential for cross-fertilization between knowledge-based and corpus-based approaches to customization. While both approaches are needed, the key issue is how to reconcile potential contrasts and define the optimal balance between the two in order to maximise benefits for the content creation, management, and delivery applications of focus, e.g. categorization, search, navigation, retrieval, extraction, personalization, and generation.
The workshop aims to bring together people from both academia and industry to address a variety of topics in the areas of customization, knowledge representation and acquisition, and metrics for measuring complexity. We invite submissions of papers in all areas of customization of NLP components, including, but not limited to, the following topics:
Papers should be submitted electronically to r-knippen@attglobal.net in Word or PostScript format. Papers should be no longer than 3,000 words, including the abstract. Contributors should also provide their affiliation and email contact.
Presentations will be allotted 20 minutes, followed by a 10-minute discussion.
Upon notification of acceptance, authors will be provided with the LREC stylesheet and should make any necessary formatting changes for the camera-ready version to be published in the proceedings.
Deadline for workshop submission | 25th February 2002 |
Notification of acceptance | 15th March 2002 |
Final version of paper for workshop proceedings | 8th April 2002 |
Workshop | 28th May 2002 (morning session) |
Federica Busa | Webegg | federica_busa@yahoo.com |
Evelyne Viegas | Microsoft Corporation | evelynev@microsoft.com |
Antonio Sanfilippo | SRA International | antonio_sanfilippo@sra.com |
Robert Knippen | LingoMotors Inc. | r-knippen@attglobal.net |
Connie Parkes | Dictaphone | Cornelia.Parkes@dictaphone.com |
Saliha Azzam | Microsoft Corporation | salihaa@microsoft.com |
Piek Vossen | Irion Technologies | - |
Remi Zajac | Systran Corporation | - |
For any further questions related to the workshop, please email Federica Busa at federica_busa@yahoo.com.