28-30 May, 1998
First International Conference on Language Resources and Evaluation
May 28 - 30, 1998
Granada, Spain

MAY 28, 1998

10:00 - 11:30 OPENING SESSION
11:30 - 12:00 COFFEE BREAK
12:00 - 13:20 SESSIONS A, B, C, D IN PARALLEL

SESSION A: LINGUISTIC RESOURCES: General Issues
12:00 - 12:20 Alan K. Melby, Data exchange standards from the OSCAR and MARTIF projects (Invited Talk)
12:20 - 12:40 Chris Makemson, The Use of Standard Language Resources in On-line Cultural Heritage Systems (Invited Talk)
12:40 - 13:00 Wim Peters, Hamish Cunningham, Yorick Wilks, A Level Playing Field for Language Resource Evaluation
13:00 - 13:20 Isabelle de Lamberterie, Language Resources and Legal Issues (Invited Talk)

SESSION B: MACHINE TRANSLATION EVALUATION
12:00 - 12:20 John S. White and Kathryn B. Taylor, A Task-Oriented Evaluation Metric for Machine Translation
12:20 - 12:40 Eduard Hovy, Creating Useful Evaluation Metrics for Machine Translation
12:40 - 13:00 Claus Povlsen, Nancy L. Underwood, Bradley Music, Anne Neville, Evaluating text-types suitability for Machine Translation: a case study on an English-Danish MT system

SESSION C: PANEL ON THE NEED FOR MAINTENANCE OF LANGUAGE RESOURCES
12:00 - 13:20 Chair: Catherine Macleod (NYU, COMLEX Syntax, NOMLEX), A Plea for Consideration of Maintenance of Language Resources
Panelists: Lou Burnard (Oxford University), Khalid Choukri (ELRA), George Doddington (SRI), Nancy Ide (Vassar, TEI, CES), John McNaught (UMIST, EAGLES), Antoine Ogonowski (ERLI, PAROLE - SIMPLE), Richard Piepenbrock (Max Planck, Celex), Hozumi Tanaka (Tokyo Institute of Technology)

SESSION D: SPOKEN LANGUAGE DIALOGUE EVALUATION (1)
12:00 - 12:20 Joseph Polifroni, Stephanie Seneff, James Glass, Christine Pao, Edward Hurley, Philipp Schmid, Helen Meng, Lee Hetherington, Victor Zue, Evaluation Methodology for a Telephone-Based Conversational System
12:20 - 12:40 Els den Os, Gerrit Bloothooft, Analysis of the Elsnet Olympics Test of Spoken Dialogue Systems
12:40 - 13:00 Lopez-Cozar R.,
Rubio A.J., Garcia P., Segura J.C., A Spoken Dialogue System based on Dialogue Corpus Analysis
13:00 - 13:20 Antoine J-Y., Zeiliger J., Caelen J., DQR test suites for a qualitative evaluation of spoken dialogue systems: from speech understanding to dialogue strategy
13:20 - 14:40 LUNCH
14:40 - 16:40 SESSIONS E, F, G, H IN PARALLEL

SESSION E: LEXICAL ACQUISITION
14:40 - 15:00 Ulrich Heid, Building of a dictionary of German support verb constructions from text corpora
15:00 - 15:20 Stefano Federici, Simonetta Montemagni, Vito Pirrelli, Nicoletta Calzolari, Acquiring NLP lexica from running texts: the SPARKLE's approach
15:20 - 15:40 Alessandro Cucchiarelli, Danilo Luzi, Paola Velardi, Using Corpus Evidence for Automatic Gazetteer Extension
15:40 - 16:00 Judith Eckle-Kohler, Jonas Kuhn, Christian Rohrer, Lexicon Acquisition with and for symbolic NLP-systems - a bootstrapping approach
16:00 - 16:20 Evelyne Viegas, Arnim Aruelas, Sergei Nirenburg, Extending a Core Lexicon Using On-line Language resources with Savoir-Faire
16:20 - 16:40 Sandro Pedrazzini & Marcus Hoffmann, From Lexical Acquisition to Lexical Reusable Tools

SESSION F: EVALUATION IN NLP
14:40 - 15:00 Charles L. Wayne, A Case Study in Corpus Creation & Evaluation Methodologies
15:00 - 15:20 Lynette Hirschman, Language Understanding Evaluations: Lessons Learned from MUC and ATIS
15:20 - 15:40 Joseph Mariani, The Aupelf-Uref Evaluation-Based Language Engineering Actions and Related Projects
15:40 - 16:00 Nancy L. Underwood, Issues in Designing a flexible validation methodology for NLP lexica
16:00 - 16:20 Paul Baker, Lou Burnard, Tony McEnery, Andrew Wilson, Techniques for Evaluation of Language Corpora: a report from the front
16:20 - 16:40 R. Gaizauskas, M. Hepple, C.
Huyck, A Scheme for Comparative Evaluation of Diverse Parsing Systems

SESSION G: LANGUAGE RESOURCES: POLICY ISSUES
14:40 - 15:00 Simon Bensasson, Future Emerging Technologies - current thinking for FP5 (Invited Talk)
15:00 - 15:20 Dimitrios Theologitis, Linguistic Resources at the European Commission Translation Service (Invited Talk)
15:20 - 15:40 Poul Andersen, Language Engineering and Multi-lingual Issues - Cooperation with Central & Eastern Europe (Invited Talk)
15:40 - 16:00 Tarcisio Della Senta, UNL: A New Electronic Language For The Internet (Invited Talk)
16:00 - 16:20 Khalid Choukri, ELRA: From Infrastructure to Market Demands (Invited Talk)
16:20 - 16:40 Mark Liberman and Christopher Cieri, The Creation, Distribution and Use of Linguistic Data: the case of the Linguistic Data Consortium (Invited Talk)

SESSION H: SPOKEN LANGUAGE DIALOGUE EVALUATION (2)
14:40 - 15:00 Lin L. Chase, Evaluating Word Confidence Annotation for Speech Recognition Systems
15:00 - 15:20 D. Aiello, L. Cerrato, C. Delogu, A. Di Carlo, Definition and evaluation of a speech translation prototype for limited domain tasks
15:20 - 15:40 Ludwig Hitzenberger, Man Machine Interaction in Car Information Systems
15:40 - 16:00 Laila Dybkjaer and Niels Ole Bernsen, The DISC Approach to Development and Evaluation
16:00 - 16:20 A.G.G. Bouwman, J. Hulstijn, Dialogue Strategy (Re-)Design with Reliability Measures
16:20 - 16:40 Wolfgang Minker, Evaluation Methodologies for Interactive Speech Systems

16:40 - 17:00 COFFEE BREAK

17:00 - 18:20 PANEL OF THE FUNDING AGENCIES
CHAIR: Antonio Zampolli (ILC)
PANELISTS: Roberto Cencioni (EC), Ron Larsen (ARPA), Gary Strong (NSF)
DISCUSSANTS: Nuria Bel (FBG), Ralph Grishman (NYU), Nancy Ide (Vassar College), Joseph Mariani (LIMSI), Nick Ostler (Linguacubun)

18:20 - 19:30 PANEL ON COOPERATION BETWEEN EU AND OTHER COUNTRIES IN THE FIELD OF LANGUAGE RESOURCES AND EVALUATION
CHAIR: Mr.
Alain Servantie (DG XIII-INCO)
PANELISTS: Eva Hajicova (Charles University, Prague), Dan Tufis (Romanian Academy), Klara Vicsi (Technical Univ. of Budapest), Zygmunt Vetulani (Adam Mickiewicz University, Poznan), Mohamed Chad (University of Fez), Salem Ghazali (IRSIT, Tunis), Daniel Martin Mayorga (Telefónica Argentina)

20:00 Welcome reception. Capilla Colegio Máximo de Cartuja. Universidad de Granada and Real Academia de Ciencias Exactas, Físicas y Naturales.
22:30 Visit to the Alhambra. Consejería de Cultura de la Junta de Andalucía and Patronato de la Alhambra y Generalife.

MAY 29, 1998

9:00 - 9:40 2 KEYNOTE SPEAKERS IN PARALLEL
* Nicoletta Calzolari and Harald Höge
  Spoken & Written Language Resources in Europe:
  Spoken Language Resources for Voice Driven Man Machine Interfaces, H. Höge
  An Overview on Written Language Resources in Europe: a few Reflections, Facts, and a Vision, N. Calzolari
* Margaret King and Bente Maegaard
  Issues in Natural Language Systems Evaluation, M. King and B. Maegaard

9:40 - 10:40 SESSIONS I, J, K, L IN PARALLEL

SESSION I: LEXICAL PROJECTS (1)
9:40 - 10:00 Dan Tufis, Nancy Ide, Tomaz Erjavec, Standardised Specifications, development and Assessment of Large Morpho-Lexical Resources for Six Central and Eastern European Languages
10:00 - 10:20 Nilda Ruimy, Ornella Corazzari, Elisabetta Gola, Antonietta Spanu, Nicoletta Calzolari, Antonio Zampolli, European LE-PAROLE Project: The Italian Syntactic Lexicon
10:20 - 10:40 Anna Braasch, Anni Buhr Christensen, Sussi Olsen, Bolette S.
Pedersen, A Large scale lexicon for Danish in the Information Society

SESSION J: EVALUATION OF TOOLS & TOOLS FOR EVALUATION IN NLP
9:40 - 10:00 Patrizia Paggio and Bradley Music, Evaluation In SCARRIE
10:00 - 10:20 Emmanuelle Rodier, Semi Automatic Generation of Reference Diagnostics within an Evaluation Tool for Simplified English Checkers
10:20 - 10:40 Langlais Ph., Simard M., Theron P., Bonhomme P., Souissi E., Isabelle P., Armstrong S., Debili F., Veronis J., The ARC-A2 A Cooperative Research Project on Bilingual Text Alignment

SESSION K: SPEECH PROCESSING AND EVALUATION
9:40 - 10:00 George Zavaliagkos, Utilizing untranscribed training data to improve performance (Invited Talk)
10:00 - 10:20 Lynette Hirschman, Reading Comprehension: A Grand Challenge for Language Understanding (Invited Talk)
10:20 - 10:40 David S. Pallett, The NIST role in automatic speech recognition benchmark tests (Invited Talk)

SESSION L: SPOKEN LANGUAGE RESOURCES PROJECTS (1)
9:40 - 10:00 Florian Schiel, Speech And Speech-Related Resources at BAS
10:00 - 10:20 J.C.
Roux, Saspeech: Establishing Speech Resources for the Indigenous Languages of South Africa
10:20 - 10:40 Shuichi Itahashi, On Speech and Text Database Activities in Japan

10:40 - 11:00 COFFEE BREAK
11:00 - 12:00 SESSIONS I, J, K, L CONTINUED

SESSION I: LEXICAL PROJECTS (1)
11:00 - 11:20 Svetlana Sheremetyeva, Jim Cowie, Sergei Nirenburg and Remi Zajac, Multilingual Onomasticon as a Multipurpose NLP Resource
11:20 - 11:40 Rémi Zajac, The Habanera Lexical Knowledge Management System
11:40 - 12:00 Masumi Narita, Language Resources for "Writer Helper"

SESSION J: EVALUATION OF TOOLS & TOOLS FOR EVALUATION IN NLP
11:00 - 11:20 Thierry Declerck and Judith Klein, Evaluation of the NLP Components of an Information Extraction System for German
11:20 - 11:40 Didier Bourigault, Benoit Habert, Evaluating Terminology Extraction Systems: theoretical backgrounds and an experiment
11:40 - 12:00 Yves Simon, Chantal Enguehard, Jean Francois Hue, COMET a system to fill the lexical gaps by means of metaphor

SESSION K: SPEECH PROCESSING AND EVALUATION
11:00 - 11:20 Mark A. Przybocki, Alvin Martin, NIST Speaker recognition evaluations
11:20 - 11:40 Steven Wegmann, Dragon Systems' Automatic Transcription System for the New TDT Corpus
11:40 - 12:00 Ron Larsen, to be announced (Invited Talk)

SESSION L: SPOKEN LANGUAGE RESOURCES PROJECTS (1)
11:00 - 11:20 Christoph Draxler, Henk van den Heuvel, Herbert S. Tropf, SpeechDat Experiences in creating Large Multilingual Speech Databases for Teleservices
11:20 - 11:40 Asuncion Moreno, Harald Hoege, Joachim Koehler, Jose B. Marino, SpeechDat Across Latin America
11:40 - 12:00 P. Roach, S. Arnfield, W. Barry, S. Dimitrova, M. Boldea, A. Fourcin, W. Gonet, R. Gubrynowicz, E. Hallum, L. Lamel, K. Marasek, A. Marchal, E. Meister, K.
Vicsi, Babel: A Database of Central and Eastern European languages

12:00 - 13:20 POSTER SESSIONS 1, 2, 3, 4 IN PARALLEL

POSTER SESSION 1: LEXICON
Elena Paskaleva, The Lexical Resources of highly inflected Slavonic languages in European standards and implementation formats
Elena Barcena, Tim Read & Ricardo Mairal, Building a lexical reference system: something old, something new, something borrowed, something blue
Aduriz I., Aldezabal I., Ansa O., Artola X., Diaz de Ilarraza A., Insausti J.M., EDBL: a multi-purposed lexical support for treatment of Basque
Maite Melero, Marta Villegas, Issues on the Syntactic Encoding of a Computational Lexicon
Ricarda Dormeyer, Ingrid Fischer, Building Lexicons out of a Database for Idioms
Jesus-Luis Cunchillos Ilarri and Raquel Cervignon Moreno, Analizador Morfologico Ugaritico (AMU)
Toni Tuells, Constructing and updating the lexicon of a two-level morphological analyzer from a Machine-Readable Dictionary
Uwe Quasthoff, Tools for Automatic Lexicon Maintenance: Acquisition, Error Correction, and the Generation of Missing Values
Pedro L. Diez-Orzas, Antonietta Alonge, Exploiting Data from EuroWordNet database for Industrial Application
Eugenio Picchi, Exploiting Language Resources and Linguistic Tools for Multilingual Information Retrieval: The EUROSEARCH Approach
Alessandro Artale, Anna Goy, Bernardo Magnini, Emanuele Pianta, Carlo Strapparava, Issues in The Development Cycle of the Italian Version of WordNet
Ana Garcia-Serrano and Jesus Contreras, A Computational Platform for Ugaritic Morphological Analysis
Martin Hoelter, Rolf Wilkens, CCSD-Online - An English online-dictionary with multimedia extensions

POSTER SESSION 2: EVALUATION IN NLP (1)
Natalia Brines-Moya, Julie Hartill, Criteria for user-oriented evaluation of monolingual text corpora interfaces
Judith L.
Klavans, Kathleen McKeown, Min-Yen Kan, Susan Lee, Resources for evaluation of summarization techniques
Tibor Kiss, Daniela Steinbrecher, Lexical Replacement in Test Suites for the Evaluation of Natural Language Applications
N. Belmore, Automated Procedures for evaluating tagging
Josep Carmona, Sergi Cervell, M. Antonia Marti, Lluis Marquez, Lluis Padro, Roberto Placer, Horacio Rodriguez, Mariona Taule, Jordi Turmo, An Environment for Morphosyntactic Processing of Unrestricted Spanish text
Guido Boella and Leonardo Lesmo, Automatic Refinement of Linguistic Rules for Tagging
Jan Hajic, Barbora Hladka, Czech Language Processing / POS Tagging
Vilson J. Leffa, Clause Processing in Complex Sentences
Bernd Geistert, Manuela Boros, Ute Ehrlich, Administration of large grammar resources - Design and implementation of a 'Grammar Pool'
Janusz S. Bien, Evaluating analysers of Polish

POSTER SESSION 3: NL CORPUS
Toni Badia, Manel Pujol, Antoni Tuells, Jordi Vivaldi, Lluís de Yzaguirre, Teresa Cabré, IULA's LSP Multilingual Corpus: compilation and processing
J.G. Kruyt, Dutch written language resources, their users and uses
Charles Fillmore, Nancy Ide, Dan Jurafsky, Catherine Macleod, An American National Corpus: A Proposal
Tomaz Erjavec, Nancy Ide, The MULTEXT-East Corpus
Susan Armstrong, Masja Kempen, David McKelvie, Dominique Petitpierre, Reinhard Rapp, Henry S.
Thompson, Multilingual Corpora for Cooperation
Luisa Alice Santos Pereira, Corpus De Referencia do Portugues Contemporaneo
Tomaz Erjavec, Ann Lawson, Laurent Romary, East meets West: Producing Multilingual Resources in a European Context
Marie-Paule Pery-Woodley, Josette Rebeyrolle, Domain and Genre in sublanguage text: definitional microtexts in three corpora
Anne Abeillé, Lionel Clément, Rodrigo Reyès, TALANA Annotated Corpus: the first results
Giacomo Ferrari, Preliminary steps towards the creation of a Discourse and Text Resource
Joseba Abaitua, Arantza Casillas, Raquel Martinez, Value added Tagging for Multilingual resources management
Jesus-Luis Cunchillos Ilarri and Joaquin Siabra, Herramienta para el Tratamiento critico de textos. Desarrollo del modulo basico (modulo-1)
Teruo Koyama, Masaharu Yoshioka, Kyo Kageura, The Construction of a Lexically Motivated Corpus - The Problem of Defining Proper Lexical Unit
David Day, John Aberdeen, Sasha Caskey, Lynette Hirschman, Patricia Robinson and Marc Vilain, Alembic Workbench Corpus Development Tool

POSTER SESSION 4: SPEECH DATABASES AND PHONETIC LEXICA
Ryszard Gubrynowicz, The Polish database of spoken language
Ute Ziegenhain, Steffen Harengel, Janez Kaiser, Ralph Wilhelm, Creating Large Pronunciation Lexica for Speech Applications
Maria Fernanda Bacelar do Nascimento, Portugues falado, variedades geograficas e sociais - Program LINGUA/SOCRATES
Robert Neumann, The Historical Yiddish Language Resource: The Archives of the Language and Culture Atlas of Ashkenazic Jewry
Susanne Burger, Christoph Draxler, Identifying Dialects of German from Digit Strings
Stefan Grocholewski, First Database for Spoken Polish
Tatiana Y. Sherstinova, Speech Evaluation in Russian Phonetic Database
Jong-mi Kim, Stephen A.
Dyer, Dwight Day, Construction of a Speech Translation database
Anja Elsner, Thomas Portele, Monika Rauth, Gerit Sonntag, Maria Wolters, Constructing a prosodic database for American English
Susanne Burger, Florian Schiel, RVG 1 - A Database for Regional Variants of Contemporary Spoken German
Simon Dobrisek, Jerneja Gros, France Mihelic, Nikola Pavesic, Recording and Labelling of the GOPOLIS Slovenian Speech Database
Hartmut R. Pfitzinger, The Collection of spoken language resources in car environments
Javier Ortega Garcia, Joaquin Gonzalez-Rodriguez, Victoria Marrero-Aguiar, Juan J. Diaz-Gomez, Ramon Garcia-Jimenez, Jose Lucena Molina, Jose A.G. Sanchez-Molero, Speaker recognition-oriented 'Ahumada' large speech corpus
D. Langmann, T. Schneider, R. Grudszus, A. Fischer, T. Crull, CSDC - The MoTiV Car-Speech Data Collection
L.F. Lamel, G. Adda, M. Adda-Decker, C. Corredor, J.J. Gangolf, J.L. Gauvain, A Multilingual Corpus for Language Identification
Vera Semanova-Fluhr, Language Systems and Resources in Russia
Martine de Calmes and Guy Perennou, BDLex: a lexicon for Spoken & Written French

13:20 - 14:40 LUNCH
14:40 - 16:40 SESSIONS M, N, O, P, Q IN PARALLEL

SESSION M: LEXICAL PROJECTS (2): SEMANTIC NETS
14:40 - 15:00 Adriana Roventini, Nicoletta Calzolari, Carol Peters, Building a Semantic Network for Italian using Existing Lexical resources
15:00 - 15:20 Antonietta Alonge, Data on Verb Semantics in the EuroWordNet Database
15:20 - 15:40 Bonnie Dorr, M. Antonia Marti and Irene Castellon, Evaluation of LCS- and EuroWordNet-Based Lexical Resources for Machine Translation
15:40 - 16:00 Piek Vossen and Laura Bloksma, Categories and classifications in EuroWordNet
16:00 - 16:20 Wim Peters, Piek Vossen, The Reduction of Semantic Ambiguity in Linguistic Resources
16:20 - 16:40 Charles J. Fillmore, Beryl T. S. Atkins, FrameNet and Lexicographic Reference

SESSION N: EVALUATION: TOKENIZERS, TAGGERS, PARSERS
14:40 - 15:00 B. Habert, G. Adda, M. Adda-Decker, P. Boula de Mareuil, S.
Ferrari, O. Ferret, G. Illouz, P. Paroubek, The Need for Tokenization evaluation
15:00 - 15:20 Patrick Paroubek, Gilles Adda, Joseph Mariani, Josette Lecomte, Martin Rajman, The GRACE French Part-Of-Speech Tagging Evaluation Task
15:20 - 15:40 Josette Lecomte, Nadine Lucas, Martin Rajman, Linguistic Issues in GRACE (evaluation of Part-Of-Speech tagging for French)
15:40 - 16:00 Marc Bertier, Genevieve Lallich-Boidin, A Paradox Raised by the Evaluation of Taggers
16:00 - 16:20 Martin Wynne, Roger Garside, Geoffrey Leech, Andrew Wilson, Parallel Wordclass Tagging
16:20 - 16:40 John Carroll, Ted Briscoe, Antonio Sanfilippo, Parser Evaluation: a Survey and a New Proposal

SESSION O: NL CORPUS PROJECTS
14:40 - 15:00 Koichi Hashida, Hitoshi Isahara, Takenobu Tokunaga, Minako Hashimoto, Shiho Ogino, Wakako Kashino, RWC text database
15:00 - 15:20 Nancy Ide, Corpus Encoding Standard: SGML guidelines for Encoding Linguistic Corpora
15:20 - 15:40 Hitoshi Isahara, JEIDA's English-Japanese Bilingual Corpus Project
15:40 - 16:00 Diana Santos, Providing access to language resources through the WorldWideWeb: the Oslo corpus of Bosnian Texts
16:00 - 16:20 Dan Cristea, Nancy Ide, Laurent Romary, Marking-up multiple views on a text: discourse and reference
16:20 - 16:40 Michel Simard, The BAF: A Corpus of English-French Bitext

SESSION P: SPOKEN LANGUAGE RESOURCES PROJECTS (2)
14:40 - 15:00 Jesus E. Diaz, Antonio M. Peinado, Antonio J. Rubio, E. Segarra, N. Prieto, F.
Casacuberta, Albayzin: a task-oriented Spanish speech corpus
15:00 - 15:20 Federico Albano Leoni, Andrea Paoloni, Mario Refice, Rinaldo, Alberto Sobrero, CLIP Corpus della Lingua Italiana Parlata (Corpus of Spoken Italian)

SESSION Q: LANGUAGE RESOURCES: STRATEGIC ISSUES
15:20 - 15:40 Ron Cole, Language Resources for Everyone (Invited Talk)
15:40 - 16:00 Gosse Bouma, Ineke Schuurman, Intergovernmental language policy and the evaluation of resources, tools and end products for Dutch
16:00 - 16:20 David Brooks, Language Resources and International Product Strategy (Invited Talk)
16:20 - 16:40 Giovanni Varile, Future Perspectives in Human Language Technology (Invited Talk)

16:40 - 17:00 COFFEE BREAK

17:00 - 18:40 2 PANELS IN PARALLEL
* EAGLES PANEL ON LEXICAL SEMANTIC STANDARDS FOR INFORMATION SYSTEMS
  CHAIR: Antonio Sanfilippo (Sharp)
  PANELISTS: Nicoletta Calzolari (ILC), Patrick Saint-Dizier (IRIT), Piek Vossen (Amsterdam Univ.), Robert Gaizauskas (Sheffield Univ.), Sophia Ananiadou (Manchester Metropolitan Univ.)
  DISCUSSANTS: Eduard Hovy (USC), Ralph Grishman (NYU), Sergei Nirenburg (EMU), Lin Chase (LIMSI-CNRS)
* INDUSTRIAL AND R&D USE OF LANGUAGE RESOURCES
  CHAIR: Khalid Choukri (ELRA)
  PANELISTS: D. Brooks (Microsoft, USA), J.P. Chanod (Xerox, France), C. Cirilli (Synthema, Italy), M. Hunt (Dragon, UK), I. Johnson (Sharp, UK), S. Kunzmann (IBM-Europe, Germany), N. Lenke (PHILIPS, Germany), J. Odijk (Lernout & Hauspie Speech products, Belgium)

MAY 30, 1998

9:00 - 9:40 2 KEYNOTE SPEAKERS IN PARALLEL
* Donna Harman and Greg Grefenstette
  The Text REtrieval Conferences (TRECs) and the Cross-Language Track, D. Harman
  Problems and Techniques for Cross Language Information Retrieval, G.
Grefenstette
* Christian Dugast and Lori Lamel
  Issues in Man-Machine Spoken Dialogues

9:40 - 10:40 SESSIONS R, S, T, U IN PARALLEL

SESSION R: ONTOLOGIES & KNOWLEDGE BASES
9:40 - 10:00 Nicola Guarino, Some Ontological Principles for the Design of Upper Level Lexical Resources
10:00 - 10:20 Eduard Hovy, Combining and Standardizing Large-Scale, Practical Ontologies for Machine Translation and Other Uses
10:20 - 10:40 Philippe Alcouffe, From Thematic index to semantic links: querying multimedia reference CD-ROMs as knowledge bases

SESSION S: EVALUATION IN NLP: TASKS & COMPONENTS
9:40 - 10:00 Robert Dale, Chris Mellish, Issues in Evaluating Natural language Generation
10:00 - 10:20 Amit Bagga, Evaluation Of Coreferences and Coreference Resolution Systems
10:20 - 10:40 Andrei Popescu-Belis, How Corpora with Annotated Coreference Links Improve Anaphora and Reference Resolution

SESSION T: TOOLS FOR NLP
9:40 - 10:00 Dan Tufis, Oliver Mason, Tagging Romanian Texts: A Case Study for QTAG, a Language Independent POS-Tagger
10:00 - 10:20 Ferran Pla, Natividad Prieto, Using Grammatical Inference Methods for Automatic Part-of-Speech Tagging
10:20 - 10:40 Irene Castellon, Montse Civit, Jordi Atserias, Syntactic Parsing of Unrestricted Spanish Text

SESSION U: SPOKEN LANGUAGE SYSTEMS EVALUATION
9:40 - 10:00 Louis C.W. Pols, Jan P.H. van Santen, Masanobu Abe, Dan Kahn, Eric Keller, The Use of large text corpora for evaluating text-to-speech systems
10:00 - 10:20 P. Boula de Mareuil, F. Yvon, C. d'Alessandro, V. Aubergé, M. Bagein, G. Bailly, F. Bechet, S. Foukia, J.-P. Goldman, E. Keller, D. O'Shaughnessy, V. Pagel, F. Sannier, J. Veronis, B.
Zellner, Objective evaluation methodology of grapheme-to-phoneme conversion for text-to-speech synthesis in French
10:20 - 10:40 Yann Morlec, Albert Rilliard, Gerard Bailly and Veronique Auberge, Evaluating The adequacy of synthetic prosody in signaling syntactic boundaries: methodology and first results

10:40 - 11:00 COFFEE BREAK
11:00 - 12:00 SESSIONS S, T, U CONTINUED

SESSION S: EVALUATION IN NLP: TASKS & COMPONENTS
11:00 - 11:20 K. Netter, S. Armstrong, T. Kiss, J. Klein, S. Lehmann, D. Milward, D. Petitpierre, S. Pulman, S. Regnier-Prost, R. Schaler, H. Uszkoreit, T. Wegst, DIET - Diagnostic and Evaluation Tools for Natural Language Applications
11:20 - 11:40 Adam Kilgarriff, Gold Standard Resources for Evaluating Word Sense Disambiguation Programs
11:40 - 12:00 Seung Hyun Yang, Young-Sum Kim, A Quantitative Measure or Quality for Evaluating Sentences Based on Genetic Algorithm

SESSION T: TOOLS FOR NLP
11:00 - 11:20 I. Prodanof, A. Cappelli, L. Moretti, M. Carenini, P. Moreschini, M. Vanocchi, A Grammar development environment for reusable and easily customizable NL applications
11:20 - 11:40 Alberto Lavelli, Fabio Pianesi, Developing Language Resources and Applications with Geppetto
11:40 - 12:00 Nabil Hathout & Fiammetta Namer, Automatic Construction and Validation of French Large Lexical Resources.
Reuse of Verb Theoretical Linguistic Descriptions

SESSION U: SPOKEN LANGUAGE SYSTEMS EVALUATION
11:00 - 11:20 Jerneja Gros, France Mihelic and Nikola Pavesic, Speech Quality Evaluation in Slovenia TTS
11:20 - 11:40 Lise van Haaren, Marc Blasband, Marinel Gerritsen, Marcha van Schijndel, Evaluating Quality of Speech Recognition Systems: Comparing a Technology-focused and a User-focused Approach

12:00 - 13:20 POSTER SESSIONS 5, 6, 7, 8 IN PARALLEL

POSTER SESSION 5: LEXICA, CORPUS, TERMINOLOGY
Catherine Macleod, Ralph Grishman, and Adam Meyers, Dictionaries and Balanced Corpora: The interdependence of resources
Alexandre Zoubov, Computer fund of Byelorussian
Serge A. Yablonsky, Russicon Russian Monolingual and Multilingual Language Resources, Software and Applications
Heiki-Jaan Kaalep, Rene Prillop, Epp Ehasalu, The Role of Internet in Creating, Financing and Integrating Language Resources
Henri Zingle, From Linguistic resources to applications with the Zstation
Christian Galinski, Terminology Infrastructures and the terminology market in Europe
G. Negrini, T.
Farnesi, An On-line reference system to manage terminological resources
Luciana Bordoni, An Experience at ENEA for Building a Specialized Thesaurus
André Le Meur, GENETER : un format générique pour la diffusion et la réutilisation de données terminologiques hétérogènes
Alessandra Fazio, Francesca Peruzzi, Development of a tool for the organization of sports terminology
Sharon Denness, Evaluating Terminology for technical authoring
Mustafa-Elhadi Widad & Jouis Christophe, Terminology Extraction and acquisition from textual data: criteria for evaluating tools and methods
Archibald Michiels & Nicolas Dufour, DEFI: a Tool for Automatic Multi-Word Unit Recognition, Meaning Assignment and Translation Selection

POSTER SESSION 6: EVALUATION IN NLP (2)
Luca Dini, Vittorio Di Tomaso, Frederique Segond, Rule based semantic tagging
Alberto Diaz Esteban, Manuel de Buenaga Rodriguez, L. Alfonso Urena Lopez, Manuel Garcia Vega, Integrating linguistic resources in a uniform way for text classification tasks
Eleni Efthimiou and Christina Alexandri, On The Treatment of Extra-linguistic Knowledge in Grammar Resources
Bruno Landi, Patrick Kremer, Laurent Schmitt, AMARYLLIS: an evaluation experiment on search engine in a French-speaking context
Hanmin Jung, Sanghwa Yuh, Chul-Min Sim, Taewan Kim, Dong-In Park, Domain Identifier for the Use of Balanced Web Documents
Bernhard Staudinger and Nancy Smith, Some Problems in the Evaluation of the Russian-German Machine Translation System MIROSLAV
Davide Turcato, Fred Popowich, Olivier Laurens, Paul McFetridge, John Grayson, Re-use of linguistic resources in MT
Chadia Moghrabi, Christian Boitet, Using GETA's MT/NLP Tools as Expert Module in an intelligent Tutoring System for French
André Jean-Marc Loechel, Laura Garcia Vitoria, De l'utilisation de l'internet a la realisation de pages web comme support principal de didactique des languages
Reinhard Schaler, Localisation is good for you.
The localisation resources Centre as an example for the successful co-operation between industrial users and academic researchers

POSTER SESSION 7: APPLICATIONS
Florence Reeder, Trainer Beware: Corpora for Language/Encoding Identification
Joerg Schuetz and Rita Nuebel, Multi-Purpose vs. Specific Application: Diagnostic Evaluation of Multilingual Language Technologies
Eugenio Picchi, CiBIT: Biblioteca Telematica Italiana. A Digital Library for the Italian Cultural Heritage
Claude de Loupy, Marc El-Beze, Pierre-Francois Marteau, Word Sense Disambiguation using HMM Tagger
Amit Bagga, Evaluation Of Coreferences and Coreference Resolution Systems
Laurent Fischer, An Approach to linguistic knowledge discovery assistants
Josep Carmona, Sergi Cervell, M. Antonia Marti, Lluis Marquez, Lluis Padro, Roberto Placer, Horacio Rodriguez, Mariona Taule, Jordi Turmo, An Environment for Morphosyntactic Processing of Unrestricted Spanish text

POSTER SESSION 8: TOOLS & FORMAT FOR SPOKEN LRs
Ivan Kopecek, Automatic Segmentation into Syllable Segments
Geoffrey Sampson, Consistent Annotation of Speech-Repair Structures
Pavel Fryda, Ivan Kopecek, PHC Format for Managing Data in Phonetic Corpora
Maxine Eskenazi, Robert Frederking, Issues in Database design: recording and processing speech from new populations
Florian Schiel, Susanne Burger, Anja Geumann, Karl Weilhammer, The Partitur Format at BAS
J. Bruce Millar, A Structure for Comprehensive Spoken Language Description
Nick Campbell, Design of Speech Corpora for use in Concatenative Synthesis Systems
Christoph Draxler, WWWSigTranscribe - AN EXTENSION OF THE WWWTranscribe TOOLBOX
Klara Vicsi and Attila Vig, Language Independent Automatic Segmentation Technique using Sampa Labelling of Phonemes
José A.R.
Fonollosa, Asuncion Moreno, Automatic Database Acquisition Software for ISDN PC Cards and Analogic Boards
Odile Mella and Dominique Fohr, Two Tools for semi-automatic phonetic labelling of large corpora
Youngkil Kim, Sanghwa Yuh, Hanmin Jung, Taewan Kim, Dong-In Park, Automatic Extraction of Illocutionary Forces' Types in a Dialogue, Based on Context and Modal Information
Eva Knodt, Jared Bernstein, Ognien Todic, A Protocol for Collecting a Corpus of Spontaneous, Conversational, Hispanic English
Rolf Wilkens, Martin Hoelter, EYDES Transcription Workbench - Bidirectional transcription of Yiddish spoken text
Karl Weilhammer, Susanne Burger, Characterizing a Database of spoken German by techniques of data mining
Albino Nogueiras Rodriguez and Asuncion Moreno Bilbao, NaniBd: a Set of Tools for Transcribing and Validating Speech Databases
Brian MacWhinney and Steven Gillis, The CHILDES System
Claude Barras, Edouard Geoffrois, Zhibiao Wu, Mark Liberman, Transcriber: a Free Tool for Segmenting, Labeling and Transcribing Speech

13:20 - 14:40 LUNCH
14:40 - 16:40 SESSIONS W, X, Y, Z IN PARALLEL

SESSION W: TERMINOLOGY
14:40 - 15:00 B. Habert, A. Nazarenko, P. Zweigenbaum and J. Bouaud, Extending an existing specialized lexicon
15:00 - 15:20 Beatrice Daille, Christian Jacquemin, Lexical Database and information access: a fruitful association?
15:20 - 15:40 Thierry Hamon, Adeline Nazarenko, Using General semantic information to help the terminology structuration
15:40 - 16:00 Diana Maynard and Sophia Ananiadou, Term Sense Disambiguation Using a Domain-Specific Thesaurus
16:00 - 16:20 Rochdi Oueslati, A Corpus-based method for linguistic knowledge extraction and its evaluation
16:20 - 16:40 Ute Ehrlich, Automatic Extraction of a Unique Terminology Based on Multilingual Corpus and Dictionary

SESSION X: TREEBANKS
14:40 - 15:00 Petra Maier-Meyer, Juergen Oesterle, The GNoP (German Noun Phrase) Treebank
15:00 - 15:20 Brigitte Krenn, Wojciech Skut, Thorsten Brants, Construction of a Linguistically Interpreted German Newspaper Corpus
15:20 - 15:40 Eva Hajicova, Jarmila Panevova, Language Resources Need Annotations to Make them Really Reusable: The Prague Tree Bank
15:40 - 16:00 Sadao Kurohashi and Makoto Nagao, Building a Japanese Parsed Corpus while Improving the Parsing System
16:00 - 16:20 Hsin-Hsi Chen and Min-Shin Shaw, A Treebank Development Tool
16:20 - 16:40 Akira Ichikawa, Masahiro Araki, Yasuo Horiuchi, Masato Ishizaki, Shuichi Itahashi, Toshihiko Ito, Hideki Kashioka, Keiji Kato, Hideaki Kikuchi, Hanae Koiso, Tomoko Kumagai, Akira Kurematsu, Kikuo Maekawa, Katsuhiro Murakami, Shu Nakazato, Yoichi Yamashita, Masafumi Tamoto, Syun Tutiya, Takashi Yoshimura, Standardising Annotation Schemes for Japanese Discourse

SESSION Y: MULTILINGUAL ISSUES
14:40 - 15:00 Sergei Nirenburg, Project Boas: "A Linguist in the Box" as a Multi-Purpose Language Resource
15:00 - 15:20 Jeff Allen, Christopher Hogan, Expanding lexical coverage of parallel corpora for the EBMT approach
15:20 - 15:40 Gregory Grefenstette, Evaluating The Adequacy of a Multilingual Transfer Dictionary for the Cross Language Information Retrieval Task
15:40 - 16:00 Bonnie Dorr, Douglas W.
Oard, Evaluating Resources for Query Translation in Cross-Language Information Retrieval
16:00 - 16:20 Ingeborg Blank, Computer-aided analysis of multilingual patent texts
16:20 - 16:40 Thomas Schneider, Multilingual Information Processing: the AVENTINUS Project

SESSION Z: SPOKEN LANGUAGE SYSTEMS AND EVALUATION
14:40 - 15:00 J.M. Dolmazon, F. Bimbot, G. Adda, M. El-Beze, J-C. Caerou, J. Zeiliger, M. Adda-Decker, An Overview of the first evaluation campaign for speech dictation systems in French
15:00 - 15:20 M. Adda-Decker, G. Adda, L. Lamel, J.-L. Gauvain, On evaluating speech and text corpora for French speech recognition
15:20 - 15:40 Lin L. Chase, A Review of the American Switchboard and Callhome Speech Recognition Evaluation Programs
15:40 - 16:00 P. Garcia, A.J. Rubio, J. Diaz-Verdejo, M.C. Benitez, J.M. Lopez-Soler, On The Comparison of Speech recognition Tasks
16:00 - 16:20 M. Jardino, F. Bimbot, S. Igounet, K. Smaili, I. Zitouni, M. El-Beze, A First Evaluation Campaign for Language Models
16:20 - 16:40 Lucian Galescu, Eric Ringger, James Allen, Rapid Language Model Development for New Task Domains

16:40 - 17:00 COFFEE BREAK
17:00 - 18:30 CLOSING SESSION SUMMARY & OUTCOME
21:00 Social dinner. Carmen de los Mártires (Coro Rociero Arrayanes)

MAY 31, 1998

Optional visit to Montefrío. Ayuntamiento de Montefrío and Cooperativa Olivarera San Francisco Asís