LREC - Programme of the Conference

FIRST INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION
28-30 May, 1998
GRANADA, SPAIN

MAY 28, 1998

10:00 - 11:30   OPENING SESSION

11:30 - 12:00   COFFEE BREAK

12:00 - 13:20   SESSIONS A, B, C, D IN PARALLEL

                SESSION A:  LINGUISTIC RESOURCES:  General Issues
12:00 - 12:20   Alan K. Melby, Data exchange standards from the OSCAR and MARTIF projects (Invited Talk)
12:20 - 12:40   Chris Makemson, The Use of Standard Language Resources in On-line Cultural Heritage Systems (Invited Talk)
12:40 - 13:00   Wim Peters, Hamish Cunningham, Yorick Wilks, A Level Playing Field for Language Resource Evaluation
13:00 - 13:20   Isabelle de Lamberterie, Language Resources and Legal Issues (Invited Talk)
                        
                SESSION B:  MACHINE TRANSLATION EVALUATION
12:00 - 12:20   John S. White and Kathryn B. Taylor, A Task-Oriented Evaluation Metric for Machine Translation
12:20 - 12:40   Eduard Hovy, Creating Useful Evaluation Metrics for Machine Translation
12:40 - 13:00   Claus Povlsen, Nancy L. Underwood, Bradley Music, Anne Neville, Evaluating text-types suitability for Machine Translation:
                a case study on an English-Danish MT system 
                
                SESSION C: PANEL ON THE NEED FOR MAINTENANCE OF LANGUAGE RESOURCES
12:00 - 13:20   Chair:  Catherine Macleod (NYU, COMLEX Syntax, NOMLEX), A Plea for Consideration of Maintenance of Language Resources
Panelists:  Lou Burnard (Oxford University), Khalid Choukri (ELRA), George Doddington (SRI), Nancy Ide (Vassar, TEI, CES), John McNaught (UMIST, EAGLES), Antoine Ogonowski (ERLI, PAROLE - SIMPLE), Richard Piepenbrock (Max Planck, Celex), Hozumi Tanaka (Tokyo Institute of Technology)
        
                SESSION D:  SPOKEN LANGUAGE DIALOGUE EVALUATION (1)
12:00 - 12:20   Joseph Polifroni, Stephanie Seneff, James Glass, Christine Pao, Edward Hurley, Philipp Schmid, 
                Helen Meng, Lee Hetherington, Victor Zue,  Evaluation Methodology for a Telephone-Based Conversational System
12:20 - 12:40   Els den Os, Gerrit Bloothooft, Analysis of the Elsnet Olympics Test of Spoken Dialogue Systems
12:40 - 13:00   Lopez-Cozar R., Rubio A.J., Garcia P., Segura J.C., A Spoken Dialogue System based on Dialogue Corpus Analysis
13:00 - 13:20   Antoine J-Y., Zeiliger J., Caelen J., DQR test suites for a qualitative evaluation of spoken dialogue systems: from speech 
        understanding to dialogue strategy


13:20 - 14:40   LUNCH

14:40 - 16:40   SESSIONS E, F, G, H IN PARALLEL

                SESSION E:  LEXICAL ACQUISITION
14:40 - 15:00   Ulrich Heid, Building of a dictionary of German support verb constructions from text corpora
15:00 - 15:20   Stefano Federici, Simonetta Montemagni, Vito Pirrelli, Nicoletta Calzolari, Acquiring NLP lexica from running texts: 
                the SPARKLE's approach
15:20 - 15:40   Alessandro Cucchiarelli, Danilo Luzi, Paola Velardi, Using Corpus Evidence for Automatic Gazetteer Extension
15:40 - 16:00   Judith Eckle-Kohler, Jonas Kuhn, Christian Rohrer,  Lexicon Acquisition with and for symbolic NLP-systems - 
                a bootstrapping approach
16:00 - 16:20   Evelyne Viegas, Arnim Aruelas, Sergei Nirenburg,  Extending a Core Lexicon Using On-line Language resources with Savoir-Faire
16:20 - 16:40   Sandro Pedrazzini & Marcus Hoffmann,  From Lexical Acquisition to Lexical Reusable Tools

                SESSION F:  EVALUATION IN NLP
14:40 - 15:00   Charles L. Wayne,  A Case Study in Corpus Creation & Evaluation Methodologies
15:00 - 15:20   Lynette Hirschman,  Language Understanding Evaluations: Lessons Learned from MUC and ATIS
15:20 - 15:40   Joseph Mariani,  The Aupelf-Uref Evaluation-Based Language Engineering Actions and Related Projects
15:40 - 16:00   Nancy L. Underwood,  Issues in Designing a flexible validation methodology for NLP lexica
16:00 - 16:20   Paul Baker, Lou Burnard, Tony McEnery, Andrew Wilson,  Techniques for Evaluation of Language Corpora: a report from the front
16:20 - 16:40   R. Gaizauskas, M. Hepple, C. Huyck, A Scheme for Comparative Evaluation of Diverse Parsing Systems
                
                SESSION G:  LANGUAGE  RESOURCES: POLICY ISSUES
14:40 - 15:00   Simon Bensasson, Future Emerging Technologies - current thinking for FP5  (Invited Talk)
15:00 - 15:20   Dimitrios Theologitis, Linguistic Resources at the European Commission Translation Service  (Invited Talk)      
15:20 - 15:40   Poul Andersen, Language Engineering and Multi-lingual Issues - Cooperation with Central & Eastern Europe  (Invited Talk)
15:40 - 16:00   Tarcisio Della Senta, UNL: A New Electronic Language For The Internet  (Invited Talk)
16:00 - 16:20   Khalid Choukri, ELRA:  From Infrastructure to Market Demands  (Invited Talk)
16:20 - 16:40   Mark Liberman and Christopher Cieri, The Creation, Distribution and Use of Linguistic Data:
                the case of the Linguistic Data Consortium  (Invited Talk)
                
                SESSION H:  SPOKEN LANGUAGE DIALOGUE EVALUATION (2)
14:40 - 15:00   Lin L. Chase, Evaluating Word Confidence Annotation for Speech Recognition Systems
15:00 - 15:20   D. Aiello, L. Cerrato, C. Delogu, A. Di Carlo,  Definition and evaluation of a speech translation prototype for limited domain tasks
15:20 - 15:40   Ludwig Hitzenberger,  Man Machine Interaction in Car Information Systems
15:40 - 16:00   Laila Dybkjaer and Niels Ole Bernsen, The DISC Approach to Development and Evaluation
16:00 - 16:20   A.G.G. Bouwman, J. Hulstijn, Dialogue Strategy (Re-)Design with Reliability Measures
16:20 - 16:40   Wolfgang Minker, Evaluation Methodologies for Interactive Speech Systems


16:40 - 17:00   COFFEE BREAK

17:00 - 18:20   PANEL OF THE FUNDING AGENCIES
CHAIR:  Antonio Zampolli (ILC)
PANELISTS:  Roberto Cencioni (EC), Ron Larsen (ARPA), Gary Strong (NSF)
DISCUSSANTS:  Nuria Bel (FBG), Ralph Grishman (NYU), Nancy Ide (Vassar College), Joseph Mariani (LIMSI), Nick Ostler (Linguacubun)

18:20 - 19:30   PANEL ON COOPERATION BETWEEN EU AND OTHER COUNTRIES IN THE FIELD OF LANGUAGE RESOURCES AND EVALUATION
CHAIR:  Mr. Alain Servantie (DG XIII-INCO)
PANELISTS: Eva Hajicova (Charles University, Prague), Dan Tufis (Romanian Academy), Klara Vicsi (Technical Univ. of Budapest), Zygmunt Vetulani (Adam Mickiewicz University, Poznan), Mohamed Chad (University of Fez), Salem Ghazali (IRSIT, Tunis),  Daniel Martin Mayorga (Telefónica Argentina)

20:00   Welcome reception. Capilla Colegio Máximo de Cartuja. Universidad de Granada and Real Academia de Ciencias Exactas, Físicas y Naturales.

22:30   Visit to the Alhambra. Consejería de Cultura de la Junta de Andalucía and Patronato de la Alhambra y Generalife.

MAY 29, 1998

9:00 - 9:40     2 KEYNOTE SPEAKERS IN PARALLEL

* Nicoletta Calzolari and Harald Höge
Spoken & Written Language Resources in Europe:
Spoken Language Resources for Voice Driven Man Machine Interfaces, H. Höge
An Overview on Written Language Resources in Europe: a few Reflections, Facts, and a Vision, N. Calzolari

* Margaret King and Bente Maegaard
 Issues in Natural Language Systems Evaluation, M. King and B. Maegaard


9:40 - 10:40    SESSIONS I, J, K, L IN PARALLEL

                SESSION I:  LEXICAL PROJECTS (1)
  9:40 - 10:00  Dan Tufis, Nancy Ide, Tomaz Erjavec,  Standardised Specifications, development and Assessment of 
        Large Morpho-Lexical Resources for Six Central and Eastern European Languages
10:00 - 10:20   Nilda Ruimy, Ornella Corazzari, Elisabetta Gola, Antonietta Spanu, Nicoletta Calzolari, Antonio Zampolli,
        European LE-PAROLE Project: The Italian Syntactic Lexicon
10:20 - 10:40   Anna Braasch, Anni Buhr Christensen, Sussi Olsen, Bolette S. Pedersen,  A Large scale lexicon for Danish in the Information Society
                
                SESSION J:  EVALUATION OF TOOLS & TOOLS FOR EVALUATION IN NLP
  9:40 - 10:00  Patrizia Paggio and Bradley Music,  Evaluation In SCARRIE
10:00 - 10:20   Emmanuelle Rodier, Semi Automatic Generation of Reference Diagnostics within an Evaluation Tool for Simplified 
        English Checkers
10:20 - 10:40   Langlais Ph., Simard M., Theron P., Bonhomme P., Souissi E., Isabelle P., Armstrong S., Debili F.,
        Veronis J.,   The ARC-A2 A Cooperative Research Project on Bilingual Text Alignment 

                SESSION K: SPEECH PROCESSING AND EVALUATION
  9:40 - 10:00  George Zavaliagkos,  Utilizing untranscribed training data to improve performance  (Invited Talk)
10:00 - 10:20   Lynette Hirschman, Reading Comprehension: A Grand Challenge for Language Understanding  (Invited Talk)
10:20 - 10:40   David S. Pallett, The NIST role in automatic speech recognition benchmark tests  (Invited Talk)
                
                SESSION L:  SPOKEN LANGUAGE RESOURCES PROJECTS (1)
  9:40 - 10:00  Florian Schiel, Speech And Speech-Related Resources at BAS
10:00 - 10:20   J.C. Roux, Saspeech: Establishing Speech Resources for the Indigenous Languages of South Africa
10:20 - 10:40   Shuichi Itahashi,  On Speech and Text Database Activities in Japan

10:40 - 11:00   COFFEE BREAK

11:00 - 12:00   SESSIONS I, J, K, L CONTINUED

                SESSION I:  LEXICAL PROJECTS (1)
11:00 - 11:20   Svetlana Sheremetyeva, Jim Cowie, Sergei Nirenburg and Remi Zajac,  Multilingual Onomasticon as a Multipurpose NLP Resource
11:20 - 11:40   Rémi Zajac,  The Habanera Lexical Knowledge Management System
11:40 - 12:00   Masumi Narita, Language Resources for "Writer Helper"
                
                SESSION J:  EVALUATION OF TOOLS & TOOLS FOR EVALUATION IN NLP
11:00 - 11:20   Thierry Declerck and Judith Klein,  Evaluation of the NLP Components of an Information Extraction System for German
11:20 - 11:40   Didier Bourigault, Benoit Habert,  Evaluating Terminology Extraction Systems: theoretical backgrounds and an experiment
11:40 - 12:00   Yves Simon, Chantal Enguehard, Jean Francois Hue, COMET a system to fill the lexical gaps by means of metaphor
                
                SESSION K: SPEECH PROCESSING AND EVALUATION
11:00 - 11:20   Mark A. Przybocki, Alvin Martin,  NIST Speaker recognition evaluations
11:20 - 11:40   Steven Wegmann, Dragon Systems' Automatic Transcription System for the New TDT Corpus
11:40 - 12:00   Ron Larsen, to be announced  (Invited Talk)

                SESSION L:  SPOKEN LANGUAGE RESOURCES PROJECTS (1)
11:00 - 11:20   Christoph Draxler, Henk van den Heuvel, Herbert S.Tropf, SpeechDat Experiences in creating Large 
                Multilingual Speech Databases for Teleservices
11:20 - 11:40   Asuncion Moreno, Harald Hoege, Joachim Koehler, Jose B.Marino, SpeechDat Across Latin America
11:40 - 12:00   P. Roach, S. Arnfield, W. Barry, S. Dimitrova, M. Boldea, A. Fourcin, W. Gonet, R. Gubrynowicz, 
                E. Hallum, L. Lamel, K. Marasek, A. Marchal, E. Meister, K. Vicsi,  Babel: A Database of Central and Eastern European languages


12:00 - 13:20   POSTER SESSIONS 1, 2, 3, 4 IN PARALLEL

POSTER SESSION 1: LEXICON

Elena Paskaleva, The Lexical Resources of highly inflected Slavonic languages in European standards and implementation formats
Elena Barcena, Tim Read & Ricardo Mairal, Building a lexical reference system: something old, something new, something borrowed, something blue
Aduriz I., Aldezabal I., Ansa O., Artola X., Diaz de Ilarraza A., Insausti J.M., EDBL: a multi-purposed lexical support for treatment of Basque
Maite Melero, Marta Villegas, Issues on the Syntactic Encoding of a Computational Lexicon
Ricarda Dormeyer, Ingrid Fischer, Building Lexicons out of a Database for Idioms
Jesus-Luis Cunchillos Ilarri and Raquel Cervignon Moreno, Analizador Morfologico Ugaritico (AMU)
Toni Tuells, Constructing and updating the lexicon of a two-level morphological analyzer from a Machine-Readable Dictionary
Uwe Quasthoff, Tools for Automatic Lexicon Maintenance: Acquisition, Error Correction, and the Generation of Missing Values
Pedro L. Diez-Orzas, Antonietta Alonge, Exploiting Data from EuroWordNet database for Industrial Application
Eugenio Picchi, Exploiting Language Resources and Linguistic Tools for Multilingual Information Retrieval: The EUROSEARCH Approach
Alessandro Artale, Anna Goy, Bernardo Magnini, Emanuele Pianta, Carlo Strapparava, Issues in The Development Cycle of the Italian Version of WordNet
Ana Garcia-Serrano and Jesus Contreras, A Computational Platform for Ugaritic Morphological Analysis
Martin Hoelter, Rolf Wilkens, CCSD-Online - An English online-dictionary with multimedia extensions

POSTER SESSION 2: EVALUATION IN NLP (1)

Natalia Brines-Moya, Julie Hartill, Criteria for user-oriented evaluation of monolingual text corpora interfaces
Judith L. Klavans, Kathleen McKeown, Min-Yen Kan, Susan Lee, Resources for evaluation of summarization techniques
Tibor Kiss, Daniela Steinbrecher, Lexical Replacement in Test Suites for the Evaluation of Natural Language Applications
N. Belmore, Automated Procedures for evaluating tagging
Josep Carmona, Sergi Cervell, M.Antonia Marti, Lluis Marquez, Lluis Padro, Roberto Placer, Horacio Rodriguez, Mariona Taule, Jordi Turmo, An Environment for Morphosyntactic Processing of Unrestricted Spanish text
Guido Boella and Leonardo Lesmo, Automatic Refinement of Linguistic Rules for Tagging
Jan Hajic, Barbora Hladka, Czech Language Processing / POS Tagging
Vilson J. Leffa, Clause Processing in Complex Sentences
Bernd Geistert, Manuela Boros, Ute Ehrlich, Administration of large grammar resources - Design and implementation of a 'Grammar Pool' 
Janusz S. Bien, Evaluating analysers of Polish

POSTER SESSION 3: NL CORPUS

Toni Badia, Manel Pujol, Antoni Tuells, Jordi Vivaldi, Lluís de Yzaguirre, Teresa Cabré, IULA's LSP Multilingual Corpus: compilation and processing
J.G. Kruyt, Dutch written language resources, their users and uses
Charles Fillmore, Nancy Ide, Dan Jurafsky, Catherine Macleod, An American National Corpus: A Proposal
Tomaz Erjavec, Nancy Ide, The MULTEXT-East Corpus
Susan Armstrong, Masja Kempen, David McKelvie, Dominique Petitpierre, Reinhard Rapp, Henry S. Thompson, Multilingual Corpora for Cooperation
Luisa Alice Santos Pereira, Corpus De Referencia do Portugues Contemporaneo
Tomaz Erjavec, Ann Lawson, Laurent Romary, East meets West: Producing Multilingual Resources in a European Context
Marie-Paule Pery-Woodley, Josette Rebeyrolle, Domain and Genre in sublanguage text: definitional microtexts in three corpora
Anne Abeillé, Lionel Clément, Rodrigo Reyès, TALANA Annotated Corpus: the first results
Giacomo Ferrari, Preliminary steps towards the creation of a Discourse and Text Resource
Joseba Abaitua, Arantza Casillas, Raquel Martinez, Value added Tagging for Multilingual resources management
Jesus-Luis Cunchillos Ilarri and Joaquin Siabra, Herramienta para el Tratamiento critico de textos. Desarrollo del modulo basico (modulo-1)
Teruo Koyama, Masaharu Yoshioka, Kyo Kageura, The Construction of a Lexically Motivated Corpus - The Problem of Defining Proper Lexical Unit
David Day, John Aberdeen, Sasha Caskey, Lynette Hirschman, Patricia Robinson and Marc Vilain, Alembic Workbench Corpus Development Tool

POSTER SESSION 4:  SPEECH DATABASES AND PHONETIC LEXICA

Ryszard Gubrynowicz, The Polish database of spoken language
Ute Ziegenhain, Steffen Harengel, Janez Kaiser, Ralph Wilhelm, Creating Large Pronunciation Lexica for Speech Applications
Maria Fernanda Bacelar do Nascimento, Portugues falado, variedades geograficas e sociais - Program LINGUA/SOCRATES
Robert Neumann, The Historical Yiddish Language Resource: The Archives of the Language and Culture Atlas of Ashkenazic Jewry
Susanne Burger, Christoph Draxler, Identifying Dialects of German from Digit Strings
Stefan Grocholewski, First Database for Spoken Polish
Tatiana Y. Sherstinova, Speech Evaluation in Russian Phonetic Database
Jong-mi Kim, Stephen A. Dyer, Dwight Day, Construction of a Speech Translation database
Anja Elsner, Thomas Portele, Monika Rauth, Gerit Sonntag, Maria Wolters, Constructing a prosodic database for American English
Susanne Burger, Florian Schiel, RVG 1 - A Database for Regional Variants of Contemporary Spoken German
Simon Dobrisek, Jerneja Gros, France Mihelic, Nikola Pavesic, Recording and Labelling of the GOPOLIS Slovenian Speech Database
Hartmut R. Pfitzinger, The Collection of spoken language resources in car environments
Javier Ortega Garcia, Joaquin Gonzalez-Rodriguez, Victoria Marrero-Aguiar, Juan J.Diaz-Gomez, Ramon Garcia-Jimenez, Jose Lucena Molina, Jose A.G. Sanchez-Molero, Speaker recognition-oriented 'Ahumada' large speech corpus
D.Langmann, T.Schneider, R.Grudszus, A.Fischer, T.Crull, CSDC - The MoTiV Car-Speech Data Collection
L.F. Lamel, G. Adda, M. Adda-Decker, C. Corredor, J.J. Gangolf, J.L. Gauvain, A Multilingual Corpus for Language Identification
Vera Semanova-Fluhr, Language Systems and Resources in Russia
Martine de Calmes and Guy Perennou, BDLex: a lexicon for Spoken & Written French

13:20 - 14:40   LUNCH

14:40 - 16:40   SESSIONS M, N, O, P, Q IN PARALLEL

                SESSION M:  LEXICAL PROJECTS (2): SEMANTIC  NETS
14:40 - 15:00   Adriana Roventini, Nicoletta Calzolari, Carol Peters,  Building a Semantic Network for Italian using Existing Lexical resources
15:00 - 15:20   Antonietta Alonge, Data on Verb Semantics in the EuroWordNet Database
15:20 - 15:40   Bonnie Dorr, M. Antonia Marti and Irene Castellon,  Evaluation of LCS- and EuroWordNet-Based Lexical Resources for 
                Machine Translation
15:40 - 16:00   Piek Vossen and Laura Bloksma, Categories and classifications in EuroWordNet
16:00 - 16:20   Wim Peters, Piek Vossen,  The Reduction of Semantic Ambiguity in Linguistic Resources
16:20 - 16:40   Charles J. Fillmore, Beryl T. S. Atkins,  FrameNet and Lexicographic Reference
                
                SESSION N:  EVALUATION: TOKENIZERS, TAGGERS, PARSERS
14:40 - 15:00   B. Habert, G. Adda, M. Adda-Decker, P. Boula de Mareuil, S. Ferrari, O. Ferret, G. Illouz, 
                P. Paroubek,  The Need for Tokenization evaluation
15:00 - 15:20   Patrick Paroubek, Gilles Adda, Joseph Mariani, Josette Lecomte, Martin Rajman,  The GRACE French Part-Of-Speech 
                Tagging Evaluation Task
15:20 - 15:40   Josette Lecomte, Nadine Lucas, Martin Rajman,  Linguistic Issues in GRACE (evaluation of Part-Of-Speech tagging for French)
15:40 - 16:00   Marc Bertier, Genevieve Lallich-Boidin,  A Paradox Raised by the Evaluation of Taggers
16:00 - 16:20   Martin Wynne, Roger Garside, Geoffrey Leech, Andrew Wilson,  Parallel Wordclass Tagging
16:20 - 16:40   John Carroll, Ted Briscoe, Antonio Sanfilippo,  Parser Evaluation: a Survey and a New Proposal
                
                SESSION O:  NL CORPUS PROJECTS
14:40 - 15:00   Koichi Hashida, Hitoshi Isahara, Takenobu Tokunaga, Minako Hashimoto, Shiho Ogino, 
                Wakako Kashino,  RWC text database
15:00 - 15:20   Nancy Ide,  Corpus Encoding Standard: SGML guidelines for Encoding Linguistic Corpora
15:20 - 15:40   Hitoshi Isahara,  JEIDA's English-Japanese Bilingual Corpus Project
15:40 - 16:00   Diana Santos,  Providing access to language resources through the WorldWideWeb: the Oslo corpus of Bosnian Texts
16:00 - 16:20   Dan Cristea, Nancy Ide, Laurent Romary,  Marking-up multiple views on a text: discourse and reference
16:20 - 16:40   Michel Simard, The BAF: A Corpus of English-French Bitext
                
                SESSION P:  SPOKEN LANGUAGE RESOURCES PROJECTS (2)
14:40 - 15:00   Jesus E. Diaz, Antonio M. Peinado, Antonio J. Rubio, E. Segarra, N. Prieto, F. Casacuberta, Albayzin: a task-oriented 
                Spanish speech corpus
15:00 - 15:20   Federico Albano Leoni, Andrea Paoloni, Mario Refice, Rinaldo, Alberto Sobrero, CLIP Corpus della Lingua Italiana 
                Parlata (Corpus of Spoken Italian)
                
                SESSION Q:  LANGUAGE RESOURCES: STRATEGIC ISSUES
15:20 - 15:40   Ron Cole,  Language Resources for Everyone  (Invited Talk)
15:40 - 16:00   Gosse Bouma, Ineke Schuurman,  Intergovernmental language policy and the evaluation of resources, tools and end products for
                Dutch
16:00 - 16:20   David Brooks, Language Resources and International Product Strategy  (Invited Talk)
16:20 - 16:40   Giovanni Varile, Future Perspectives in Human Language Technology  (Invited Talk)

16:40 - 17:00   COFFEE BREAK

17:00 - 18:40   2 PANELS IN PARALLEL

* EAGLES PANEL ON LEXICAL SEMANTIC STANDARDS FOR INFORMATION SYSTEMS
CHAIR: Antonio Sanfilippo (Sharp)
PANELISTS: Nicoletta Calzolari (ILC), Patrick Saint-Dizier (IRIT), Piek Vossen (Amsterdam Univ.), Robert Gaizauskas (Sheffield Univ.), Sophia Ananiadou (Manchester Metropolitan Univ.)
DISCUSSANTS: Eduard Hovy (USC), Ralph Grishman (NYU), Sergei Nirenburg (EMU), Lin Chase (LIMSI-CNRS)

* INDUSTRIAL AND R&D USE OF LANGUAGE RESOURCES
CHAIR: Khalid Choukri (ELRA)
PANELISTS:  D. Brooks (Microsoft, USA), J.P. Chanod (Xerox, France), C. Cirilli (Synthema, Italy), M. Hunt (Dragon, UK), I. Johnson (Sharp, UK), S. Kunzmann (IBM-Europe, Germany), N. Lenke (PHILIPS, Germany), J. Odijk (Lernout & Hauspie Speech products, Belgium)

MAY 30, 1998

9:00 - 9:40     2 KEYNOTE SPEAKERS IN PARALLEL

* Donna Harman and Greg Grefenstette
The Text REtrieval Conferences (TRECs) and the Cross-Language Track, D. Harman
Problems and Techniques for Cross Language Information Retrieval, G. Grefenstette

* Christian Dugast and Lori Lamel 
Issues in Man-Machine Spoken Dialogues

9:40 - 10:40    SESSIONS R, S, T, U IN PARALLEL

                SESSION R:  ONTOLOGIES & KNOWLEDGE BASES
  9:40 - 10:00  Nicola Guarino,  Some Ontological Principles for the Design of Upper Level Lexical Resources
10:00 - 10:20   Eduard Hovy,  Combining and Standardizing Large-Scale, Practical Ontologies for Machine Translation and Other Uses
10:20 - 10:40   Philippe Alcouffe,  From Thematic index to semantic links: querying multimedia reference CD-ROMs as knowledge bases
                
                SESSION S:  EVALUATION IN NLP: TASKS & COMPONENTS
  9:40 - 10:00  Robert Dale, Chris Mellish,  Issues in Evaluating Natural language Generation
10:00 - 10:20   Amit Bagga,  Evaluation Of Coreferences and Coreference Resolution Systems
10:20 - 10:40   Andrei Popescu-Belis,  How Corpora with Annotated Coreference Links Improve Anaphora and Reference Resolution

                SESSION T:  TOOLS FOR NLP
  9:40 - 10:00  Dan Tufis, Oliver Mason,   Tagging Romanian Texts: A Case Study for QTAG, a Language Independent POS-Tagger
10:00 - 10:20   Ferran Pla, Natividad Prieto,  Using Grammatical Inference Methods for Automatic Part-of-Speech Tagging
10:20 - 10:40   Irene Castellon, Montse Civit, Jordi Atserias,  Syntactic Parsing of Unrestricted Spanish Text
                
                SESSION U:  SPOKEN LANGUAGE SYSTEMS EVALUATION
  9:40 - 10:00  Louis C.W. Pols, Jan P.H. van Santen, Masanobu Abe, Dan Kahn, Eric Keller,  The Use of large text corpora for evaluating 
                text-to-speech systems
10:00 - 10:20   P. Boula de Mareuil, F. Yvon, C. d'Alessandro, V. Aubergé, M. Bagein, G. Bailly, F. Bechet, S. Foukia,
                J.-P. Goldman, E. Keller, D. O'Shaughnessy, V. Pagel, F. Sannier, J. Veronis, B. Zellner,  Objective evaluation methodology
                of grapheme-to-phoneme conversion for text-to-speech synthesis in French
10:20 - 10:40   Yann Morlec, Albert Rilliard, Gerard Bailly and Veronique Auberge,  Evaluating The adequacy of synthetic prosody in 
                signaling syntactic boundaries: methodology and first results

10:40 - 11:00   COFFEE BREAK

11:00 - 12:00   SESSIONS S, T, U CONTINUED

                SESSION S:  EVALUATION IN NLP: TASKS & COMPONENTS
11:00 - 11:20   K. Netter, S. Armstrong, T. Kiss, J. Klein, S. Lehmann, D. Milward, D. Petitpierre, S. Pulman, 
                S. Regnier-Prost, R. Schaler, H. Uszkoreit, T. Wegst,  DIET - Diagnostic and Evaluation Tools for Natural Language Applications
11:20 - 11:40   Adam Kilgarriff,  Gold Standard Resources for Evaluating Word Sense Disambiguation Programs
11:40 - 12:00   Seung Hyun Yang, Young-Sum Kim, A Quantitative Measure of Quality for Evaluating Sentences Based on Genetic Algorithm
                
                SESSION T:  TOOLS FOR NLP
11:00 - 11:20   I. Prodanof, A. Cappelli, L. Moretti, M. Carenini, P. Moreschini, M. Vanocchi,  A Grammar development environment 
                for reusable and easily customizable NL applications
11:20 - 11:40   Alberto Lavelli, Fabio Pianesi,   Developing Language Resources and Applications with Geppetto
11:40 - 12:00   Nabil Hathout & Fiammetta Namer,  Automatic Construction and Validation of French Large Lexical Resources.
                Reuse of Verb Theoretical Linguistic Descriptions

                SESSION U:  SPOKEN LANGUAGE SYSTEMS EVALUATION
11:00 - 11:20   Jerneja Gros, France Mihelic and Nikola Pavesic,   Speech Quality Evaluation in Slovenian TTS
11:20 - 11:40   Lise van Haaren, Marc Blasband, Marinel Gerritsen, Marcha van Schijndel,  Evaluating Quality of Speech Recognition Systems:
                Comparing a Technology-focused and a User-focused Approach


12:00 - 13:20   POSTER SESSIONS 5, 6, 7, 8  IN PARALLEL

POSTER SESSION 5: LEXICA, CORPUS, TERMINOLOGY

Catherine Macleod, Ralph Grishman, and Adam Meyers, Dictionaries and Balanced Corpora: The interdependence of resources
Alexandre Zoubov, Computer fund of Byelorussian
Serge A. Yablonsky, Russicon Russian Monolingual and Multilingual Language Resources, Software and Applications
Heiki-Jaan Kaalep, Rene Prillop, Epp Ehasalu, The Role of Internet in Creating, Financing and Integrating Language Resources
Henri Zingle, From Linguistic resources to applications with the Zstation
Christian Galinski, Terminology Infrastructures and the terminology market in Europe
G. Negrini, T. Farnesi, An On-line reference system to manage terminological resources
Luciana Bordoni, An Experience at ENEA for Building a Specialized Thesaurus
André Le Meur, GENETER : un format générique pour la diffusion et la réutilisation de données terminologiques hétérogènes
Alessandra Fazio, Francesca Peruzzi, Development of a tool for the organization of sports terminology
Sharon Denness, Evaluating Terminology for technical authoring
Mustafa-Elhadi Widad & Jouis Christophe, Terminology Extraction and acquisition from textual data: criteria for evaluating tools and methods
Archibald Michiels & Nicolas Dufour, DEFI: a Tool for Automatic Multi-Word Unit Recognition, Meaning Assignment and Translation Selection

POSTER SESSION 6: EVALUATION IN NLP(2)

Luca Dini, Vittorio Di Tomaso, Frederique Segond, Rule based semantic tagging
Alberto Diaz Esteban, Manuel de Buenaga Rodriguez, L.Alfonso Urena Lopez, Manuel Garcia Vega, Integrating linguistic resources in a uniform way for text classification tasks
Eleni Efthimiou and Christina Alexandri, On The Treatment of Extra-linguistic Knowledge in Grammar Resources
Bruno Landi, Patrick Kremer, Laurent Schmitt, AMARYLLIS: an evaluation experiment on search engine in a French-speaking context
Hanmin Jung, Sanghwa Yuh, Chul-Min Sim, Taewan Kim, Dong-In Park, Domain Identifier for the Use of Balanced Web Documents
Bernhard Staudinger and Nancy Smith, Some Problems in the Evaluation of the Russian-German Machine Translation System MIROSLAV
Davide Turcato, Fred Popowich, Olivier Laurens, Paul McFetridge, John Grayson, Re-use of linguistic resources in MT
Chadia Moghrabi, Christian Boitet, Using GETA's MT/NLP Tools as Expert Module in an intelligent Tutoring System for French
André Jean-Marc Loechel, Laura Garcia Vitoria, De l'utilisation de l'internet a la realisation de pages web comme support principal de didactique des languages
Reinhard Schaler, Localisation is good for you. The localisation resources Centre as an example for the successful co-operation between industrial users and academic researchers

POSTER SESSION 7: APPLICATIONS

Florence Reeder, Trainer Beware: Corpora for Language/Encoding Identification
Joerg Schuetz and Rita Nuebel, Multi-Purpose vs. Specific Application: Diagnostic Evaluation of Multilingual Language Technologies
Eugenio Picchi, CiBIT: Biblioteca Telematica Italiana. A Digital Library for the Italian Cultural Heritage 
Claude de Loupy, Marc El-Beze, Pierre-Francois Marteau, Word Sense Disambiguation using HMM Tagger
Amit Bagga, Evaluation Of Coreferences and Coreference Resolution Systems
Laurent Fischer, An Approach to linguistic knowledge discovery assistants
Josep Carmona, Sergi Cervell, M.Antonia Marti, Lluis Marquez, Lluis Padro, Roberto Placer, Horacio Rodriguez, Mariona Taule, Jordi Turmo, An Environment for Morphosyntactic Processing of Unrestricted Spanish text

POSTER SESSION 8:  TOOLS & FORMAT FOR SPOKEN LRs

Ivan Kopecek, Automatic Segmentation into Syllable Segments
Geoffrey Sampson, Consistent Annotation of Speech-Repair Structures
Pavel Fryda, Ivan Kopecek, PHC Format for Managing Data in Phonetic Corpora
Maxine Eskenazi, Robert Frederking, Issues in Database design: recording and processing speech from new populations
Florian Schiel, Susanne Burger, Anja Geumann, Karl Weilhammer, The Partitur Format at BAS
J. Bruce Millar, A Structure for Comprehensive Spoken Language Description
Nick Campbell, Design of Speech Corpora for use in Concatenative Synthesis Systems
Christoph Draxler, WWWSigTranscribe - An Extension of the WWWTranscribe Toolbox
Klara Vicsi and Attila Vig, Language Independent Automatic Segmentation Technique using Sampa Labelling of Phonemes
José A.R. Fonollosa, Asuncion Moreno, Automatic Database Acquisition Software for ISDN PC Cards and Analogic Boards
Odile Mella and Dominique Fohr, Two Tools for semi-automatic phonetic labelling of large corpora
Youngkil Kim, Sanghwa Yuh, Hanmin Jung, Taewan Kim, Dong-In Park, Automatic Extraction of Illocutionary Forces' Types in a Dialogue, Based on Context and Modal Information
Eva Knodt, Jared Bernstein, Ognien Todic, A Protocol for Collecting a Corpus of Spontaneous, Conversational, Hispanic English
Rolf Wilkens, Martin Hoelter, EYDES Transcription Workbench - Bidirectional transcription of Yiddish spoken text
Karl Weilhammer, Susanne Burger, Characterizing a Database of spoken German by techniques of data mining
Albino Nogueiras Rodriguez and Asuncion Moreno Bilbao, NaniBd: a Set of Tools for Transcribing and Validating Speech Databases
Brian MacWhinney and Steven Gillis, The CHILDES System
Claude Barras, Edouard Geoffrois, Zhibiao Wu, Mark Liberman, Transcriber: a Free Tool for Segmenting, Labeling and Transcribing Speech

13:20 - 14:40   LUNCH

14:40 - 16:40   SESSIONS W, X, Y, Z IN PARALLEL

                SESSION W:  TERMINOLOGY
14:40 - 15:00   B. Habert, A. Nazarenko, P. Zweigenbaum and J. Bouaud,  Extending an existing specialized lexicon
15:00 - 15:20   Beatrice Daille, Christian Jacquemin,  Lexical Database and information access: a fruitful association?
15:20 - 15:40   Thierry Hamon, Adeline Nazarenko,  Using General semantic information to help the terminology structuration
15:40 - 16:00   Diana Maynard and Sophia Ananiadou,  Term Sense Disambiguation Using a Domain-Specific Thesaurus
16:00 - 16:20   Rochdi Oueslati,  A Corpus-based method for linguistic knowledge extraction and its evaluation
16:20 - 16:40   Ute Ehrlich,  Automatic Extraction of a Unique Terminology Based on Multilingual Corpus and Dictionary
                
                SESSION X:  TREEBANKS
14:40 - 15:00   Petra Maier-Meyer, Juergen Oesterle,  The GNoP (German Noun Phrase) Treebank
15:00 - 15:20   Brigitte Krenn, Wojciech Skut, Thorsten Brants,  Construction of a Linguistically Interpreted German Newspaper Corpus
15:20 - 15:40   Eva Hajicova, Jarmila Panevova,  Language Resources Need Annotations to Make them Really Reusable: The Prague Tree Bank
15:40 - 16:00   Sadao Kurohashi and Makoto Nagao,  Building a Japanese Parsed Corpus while Improving the Parsing System
16:00 - 16:20   Hsin-Hsi Chen and Min-Shin Shaw,  A Treebank Development Tool
16:20 - 16:40   Akira Ichikawa, Masahiro Araki, Yasuo Horiuchi, Masato Ishizaki, Shuichi Itahashi, Toshihiko Ito,
                Hideki Kashioka, Keiji Kato, Hideaki Kikuchi, Hanae Koiso, Tomoko Kumagai, Akira Kurematsu,
                Kikuo Maekawa, Katsuhiro Murakami, Shu Nakazato, Yoichi Yamashita, Masafumi Tamoto, Syun
                Tutiya, Takashi Yoshimura,  Standardising Annotation Schemes for Japanese Discourse
                
                SESSION Y:  MULTILINGUAL ISSUES
14:40 - 15:00   Sergei Nirenburg,  Project Boas: "A Linguist in the Box" as a Multi-Purpose Language Resource
15:00 - 15:20   Jeff Allen, Christopher Hogan,   Expanding lexical coverage of parallel corpora for the EBMT approach
15:20 - 15:40   Gregory Grefenstette,  Evaluating The Adequacy of a Multilingual Transfer Dictionary for the Cross Language Information 
                Retrieval Task
15:40 - 16:00   Bonnie Dorr, Douglas W. Oard,  Evaluating Resources for Query Translation in Cross-Language Information Retrieval
16:00 - 16:20   Ingeborg Blank,   Computer-aided analysis of multilingual patent texts
16:20 - 16:40   Thomas Schneider,  Multilingual Information Processing: the AVENTINUS Project
                
                SESSION Z:  SPOKEN LANGUAGE SYSTEMS AND EVALUATION
14:40 - 15:00   J.M. Dolmazon, F. Bimbot, G. Adda, M. El-Beze, J-C. Caerou, J. Zeiliger, M. Adda-Decker,  An Overview of the first 
                evaluation campaign for speech dictation systems in French
15:00 - 15:20   M. Adda-Decker, G. Adda, L. Lamel, J.-L. Gauvain,  On evaluating speech and text corpora for French speech recognition
15:20 - 15:40   Lin L. Chase,  A Review of the American Switchboard and Callhome Speech Recognition Evaluation Programs
15:40 - 16:00   P. Garcia, A.J. Rubio, J. Diaz-Verdejo, M.C. Benitez, J.M. Lopez-Soler,  On The Comparison of Speech recognition Tasks
16:00 - 16:20   M. Jardino, F. Bimbot, S. Igounet, K. Smaili, I. Zitouni, M. El-Beze,  A First Evaluation Campaign for Language Models
16:20 - 16:40   Lucian Galescu, Eric Ringger, James Allen,  Rapid Language Model Development for New Task Domains


16:40 - 17:00   COFFEE BREAK

17:00 - 18:30   CLOSING SESSION 

SUMMARY & OUTCOME

21:00 Social dinner. Carmen de los Mártires (Coro Rociero Arrayanes)

MAY 31, 1998

Optional visit to Montefrío. Ayuntamiento de Montefrío and Cooperativa Olivarera San Francisco Asís