MAY 31, 2002

 

 

 

9:00-9:40  KEYNOTES

James Pustejovsky, Creating Domain-specific Information Servers. – Room Camara

Gianni Lazzari, Speech to Speech Translation: Present and Future Challenges. – Room Atlantico

 

9:40-9:45  5 MIN. BREAK

 

9:45-11:05  SESSIONS IN PARALLEL: SO7, WO16, WO17, WO18, EO5

 

SESSION SO7: TOOLS FOR SPOKEN LRs - Room Camara

 

9:45-10:05

Hélène François, Olivier Boëffard, The Greedy Algorithm and its Application to the Construction of a Continuous Speech Database.

10:05-10:25

Ricardo Ribeiro, Luís Oliveira, Isabel Trancoso, Morphosyntactic Disambiguation for TTS Systems.

10:25-10:45

Jean-Pierre Martens, Diana Binnenpoorte, Kris Demuynck, Ruben Van Parys, Tom Laureys, Wim Goedertier, Jacques Duchateau, Word Segmentation in the Spoken Dutch Corpus.

10:45-11:05

Akinobu Lee, Tatsuya Kawahara, Kazuya Takeda, Masato Mimura, Atsushi Yamada, Akinori Ito, Katsunobu Itou, Kiyohiro Shikano, Continuous Speech Recognition Consortium - an Open Repository for CSR Tools and Models.

 

SESSION WO16: APPLICATIONS BASED ON WRITTEN LRs – Room Atlantico

 

9:45-10:05

Silja Huttunen, Roman Yangarber, Ralph Grishman, Diversity of Scenarios in Information Extraction.

10:05-10:25

René Schneider, n-grams of Seeds: A Hybrid System for Corpus-Based Text Summarization.

10:25-10:45

Sanda Harabagiu, Finley Lacatusu, Paul Morarescu, Multidocument Summarization with GISTexter.

 

10:45-11:05

Alessandro Lenci, Roberto Bartolini, Nicoletta Calzolari, Ana Agua, Stephan Busemann, Emmanuel Cartier, Karine Chevreau, José Coch, Multilingual Summarization by Integrating Linguistic Resources in the MLIS-MUSI Project.

 

SESSION WO17: SEMANTIC LEXICONS – Room Tenerife

9:45-10:05

Adriana Roventini, Marisa Ulivieri, Nicoletta Calzolari, Integrating Two Semantic Lexicons, SIMPLE and ItalWordNet: What Can We Gain?

10:05-10:25

Nabil Hathout, From WordNet to CELEX: acquiring morphological links from dictionaries of synonyms.

10:25-10:45

Claudia Kunze, Lothar Lemnitzer, GermaNet - representation, visualization, application.

10:45-11:05

Jerker Järborg, Dimitrios Kokkinakis, Maria Toporowska Gronostaj, Lexical and Textual Resources for Sense Recognition and Description.

 

SESSION WO18: SYNTACTIC ANNOTATION – Room Lanzarote

 

9:45-10:05

Ted Briscoe, John Carroll, Robust Accurate Statistical Annotation of General Text.

10:05-10:25

Erhard W. Hinrichs, Sandra Kübler, Frank H. Müller, Tylman Ule, A Hybrid Architecture for Robust Parsing of German.

10:25-10:45

Zdeněk Žabokrtský, Petr Sgall, Sašo Džeroski, A Machine Learning Approach to Automatic Functor Assignment in the Prague Dependency Treebank.

10:45-11:05

Roberto Bartolini, Alessandro Lenci, Simonetta Montemagni, Vito Pirrelli, The Lexicon-Grammar Balance in Robust Parsing of Italian.

 

SESSION EO5: LEXICAL EVALUATION – Room La Graciosa

 

9:45-10:05

Darren Pearce, A Comparative Evaluation of Collocation Extraction Techniques.

 

10:05-10:25

Romaric Besançon, Martin Rajman, Evaluation of a Vector Space Similarity Measure in a Multilingual Framework.

10:25-10:45

Thierry Hamon, Olivier Hû, How to evaluate necessary cooperative systems of terminology building?

10:45-11:05

Judita Preiss, Anna Korhonen, Ted Briscoe, Subcategorization Acquisition as an Evaluation Method for WSD.

 

11:05-11:20  COFFEE BREAK

 

11:20-12:40  POSTER SESSIONS IN PARALLEL: SP3, WP4, WP5, WP6, TP1

 

POSTERS SP3: ANNOTATION TOOLS: FROM SPEECH SEGMENTS TO DIALOGUES – Poster Area

 

Doroteo Torre Toledano, Luis A. Hernández Gómez, HMMs for Automatic Phonetic Segmentation.

Tom Laureys, Kris Demuynck, Jacques Duchateau, Patrick Wambacq, An Improved Algorithm for the Automatic Segmentation of Speech Corpora.

Thorsten Trippel, Dafydd Gibbon, Annotation Driven Concordancing: the PAX Toolkit.

K. López de Ipiña, N. Ezeiza, G. Bordel, Automatic Morphological Segmentation for Continuous Speech Recognition of Basque.

Carlos D. Martínez-Hinarejos, Emilio Sanchís, Fernando García-Granada, Pablo Aibar, A Labelling Proposal to Annotate Dialogues.

Claudia Sassen, Dafydd Gibbon, Enhanced Dialogue Markup for Crisis Talk Scenario Resources.

Petra Geutner, Frank Steffens, Dietrich Manstetten, Design of the VICO Spoken Dialogue System: Evaluation of User Expectations by Wizard-of-Oz Experiments.

Laurence Devillers, Sophie Rosset, Hélène Bonneau-Maynard, Lori Lamel, Annotations for Dynamic Diagnosis of the Dialog State.

Steve Whittaker, Marilyn Walker, Johanna Moore, Fish or Fowl: A Wizard of Oz Evaluation of Dialogue Strategies in the Restaurant Domain.

 

POSTERS WP4: CORPUS ANNOTATION - Poster Area

 

Nigel Collier, Koichi Takeuchi, PIA-Core: Semantic Annotation through Example-based Learning.

Tilly Dutilh, Truus Kruyt, Implementation and Evaluation of PAROLE PoS in a National Context.

Kiril Ribarov, Old Sources and Modern Procedures: Computer Processing of Old-Church Slavonic.

 

Susanne Salmon-Alt, Renata Vieira, Nominal Expressions in Multilingual Corpora: Definites and Demonstratives.

Chung-hye Han, Na-Rae Han, Eon-Suk Ko, Martha Palmer, Development and Evaluation of a Korean Treebank and its Application to NLP.

Sabine Brants, Silvia Hansen, Developments in the TIGER Annotation Scheme and their Realization in the Corpus.

X. Artola, A. Díaz de Ilarraza, N. Ezeiza, K. Gojenola, G. Hernández, A. Soroa, A Class Library for the Integration of NLP Tools: Definition and implementation of an Abstract Data Type Collection for the manipulation of SGML documents in a context of stand-off linguistic annotation.

Špela Vintar, Paul Buitelaar, Bärbel Ripplinger, Bogdan Sacaleanu, Diana Raileanu, Detlef Prescher, An Efficient and Flexible Format for Linguistic and Semantic Annotation.

Nadia Mana, Ornella Corazzari, The Lexico-semantic Annotation of an Italian Treebank.

Scott Cotton, Steven Bird, An integrated framework for treebanks and multilayer annotations.

Paul Clough, Robert Gaizauskas, S. L. Piao, Building and annotating a corpus for the study of journalistic text reuse.

Gosse Bouma, Geert Kloosterman, Querying Dependency Treebanks in XML.

Toshifumi Tanabe, Yasuo Koyama, Kenji Yoshimura, Kosho Shudo, Modal Expressions in Natural Language Sentence and Their Similarity.

Susana Afonso, Eckhard Bick, Renato Haber, Diana Santos, "Floresta Sintá(c)tica": A treebank for Portuguese.

Ilona Steiner, Laura Kallmeyer, VIQTORYA -- A Visual Query Tool for Syntactically Annotated Corpora.

Aoife Cahill, Josef van Genabith, TTS - A Treebank Tool Suite.

Serge A. Yablonsky, Corpora as Object-Oriented System. From UML-notation to Implementation.

Harris Papageorgiou, Prokopis Prokopidis, Voula Giouli, Iason Demiros, Alexis Konstantinidis, Stelios Piperidis, Multi-level XML-based Corpus Annotation.

Kiril Simov, Petya Osenova, Milena Slavcheva, Sia Kolkovska, Elisaveta Balabanova, Dimitar Doikoff, Krassimira Ivanova, Alexander Simov, Milen Kouylekov, Building a Linguistically Interpreted Corpus of Bulgarian: the BulTreeBank.

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa, Producing a Large-scale Encyclopedic Corpus over the Web.

 

POSTERS WP5: COMPONENTS & SYSTEMS - Poster Area

 

Chikashi Nobata, Satoshi Sekine, Hitoshi Isahara, Ralph Grishman, Summarization System Integrated with Named Entity Tagging and IE pattern Discovery.

Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown, Using the Annotated Bibliography as a Resource for Indicative Summarization.

Bolette S. Pedersen, Patrizia Paggio, Semantic Lexical Resources Applied to Content-based Querying - the OntoQuery Project.

Anna Sågvall Hein, Eva Forsbom, Jörg Tiedemann, Per Weijnitz, Ingrid Almqvist, Leif-Jöran Olsson, Sten Thaning, Scaling Up an MT Prototype for Industrial Use - Databases and Data Flow.

Barry Schiffman, Building a Resource for Evaluating the Importance of Sentences.

Constantin Orasan, Ramesh Krishnamurthy, A corpus-based investigation of junk emails.

Constantin Orasan, Building annotated resources for automatic text summarisation.

Andrea Bozzi, LAperLA: an integrated graphical-linguistic System for old printed Latin Texts.

Elaine Uí Dhonnchadha, A Two-level Morphological Analyser and Generator for Irish using Finite-State Transducers.

Nabil Hathout, Ludovic Tanguy, Webaffix: Discovering Morphological Links on the WWW.

Hannah Kermes, Stefan Evert, YAC - A Recursive Chunker for Unrestricted German Text.

Xavier Carreras, Lluís Padró, A Flexible Distributed Architecture for Natural Language Analyzers.

Satoshi Sekine, Kiyoshi Sudo, Chikashi Nobata, Extended Named Entity Hierarchy.

Yllias Chali, Experiments in Topic Detection.

 

POSTERS WP6: LRs & PROJECTS - Poster Area

 

Alejandro Bia, Manuel Sánchez Quero, Building ancient Spanish dictionaries for spell-checking of DL texts.

Choy-Kim Chuah, Zaharin Yusoff, Computational Linguistics at Universiti Sains Malaysia.

Carole Tiberius, Dunstan Brown, Greville Corbett, A typological database of agreement.

Fabio Tamburini, A dynamic model for reference corpora structure definition.

Yong-Ju Lee, Bong-Wan Kim, Yongnam Um, Speech Information Technology & Industry Promotion Center in Korea: Activities and Directions.

Catia Cucchiarini, Elisabeth D'Halleweyn, Lisanne Teunissen, A Human Language Technologies Platform for the Dutch language: awareness, management, maintenance and distribution.

D. Binnenpoorte, F. De Vriend, J. Sturm, W. Daelemans, H. Strik, C. Cucchiarini, A Field Survey for Establishing Priorities in the Development of HLT Resources for Dutch.

Michael Rosner, The Future of Maltilex.

 

POSTERS TP1: TERMINOLOGY - Poster Area

 

Lorna Balkan, Ken Miller, Birgit Austin, Anne Etheridge, Myriam Garcia Bernabé, Pam Miller, ELSST: a broad-based Multilingual Thesaurus for the Social Sciences.

Marianne Dabbadie, Widad Mustafa El Hadi, Ismaïl Timimi, Terminological Enrichment for non-Interactive MT Evaluation.

Judit Feliu, Jorge Vivaldi, M. Teresa Cabré, Towards an Ontology for a Human Genome Knowledge Base.

Olivier Ferret, Christian Fluhr, Françoise Rousseau-Hans, Jean-Luc Simoni, Building domain specific lexical hierarchies from corpora.

James Dowdall, Michael Hess, Neeme Kahusk, Kaarel Kaljurand, Mare Koit, Fabio Rinaldi, Kadri Vider, Technical Terminology as a Critical Resource.

Sussi Olsen, Lemma selection in domain specific computational lexica - some specific problems.

Jörg Tiedemann, MatsLex - a Multilingual Lexical Database for Machine Translation.

 

12:40-13:40  SESSIONS IN PARALLEL: SO8, WO19, WO20, WO21, WO22

 

SESSION SO8: ANNOTATION FRAMEWORKS & TOOLS – Room Camara

 

12:40-13:00

Kazuaki Maeda, Steven Bird, Xiaoyi Ma, Haejoong Lee, Creating Annotation Tools with the Annotation Graph Toolkit.

13:00-13:20

Jan-Torsten Milde, Ulrike Gut, The TASX-environment: an XML-based toolset for time aligned speech corpora.

13:20-13:40

Christophe Laprun, Jonathan G. Fiscus, John Garofolo, Sylvain Pajot, A Practical Introduction to ATLAS.

 

SESSION WO19: MULTIWORD EXPRESSIONS & METAPHORS - Room La Graciosa

12:40-13:00

Nicoletta Calzolari, Charles J. Fillmore, Ralph Grishman, Nancy Ide, Alessandro Lenci, Catherine MacLeod, Antonio Zampolli, Towards Best Practice for Multiword Expressions in Computational Lexicons.

13:00-13:20

Ann Copestake, Fabre Lambeau, Aline Villavicencio, Francis Bond, Timothy Baldwin, Ivan A. Sag, Dan Flickinger, Multiword expressions: linguistic precision and reusability.

13:20-13:40

Antonietta Alonge, Margherita Castelli, Which way should we go? Metaphoric expressions in lexical resources.

 

SESSION WO20: MACHINE TRANSLATION - Room Atlantico

 

12:40-13:00

Taro Watanabe, Mitsuo Shimohata, Eiichiro Sumita, Statistical Machine Translation on Paraphrased Corpora.

13:00-13:20

Mathieu Lafourcade, Christian Boitet, UNL Lexical Selection with Conceptual Vectors.

13:20-13:40

Satoshi Shirai, Kazuhide Yamamoto, Francis Bond, Hozumi Tanaka, Towards a Thesaurus of Predicates.

 

SESSION WO21: TREEBANKS - Room Tenerife

 

12:40-13:00

Julia Hockenmaier, Mark Steedman, Acquiring Compact Lexicalized Grammars from a Cleaner Treebank.

13:00-13:20

Alexandra Kinyon, Carlos A. Prolo, Identifying Verb Arguments and their Syntactic Function in the Penn Treebank.

13:20-13:40

Paul Kingsbury, Martha Palmer, From TreeBank to PropBank.

 

SESSION WO22: COREFERENCE - Room Lanzarote

 

12:40-13:00

Cătălina Barbu, Richard Evans, Ruslan Mitkov, A corpus based investigation of morphological disagreement in anaphoric relations.

13:00-13:20

Dan Cristea, Oana-Diana Postolache, Gabriela-Eugenia Dima, Cătălina Barbu, AR-Engine - a framework for unrestricted co-reference resolution.

13:20-13:40

Daisuke Kawahara, Sadao Kurohashi, Kôiti Hasida, Construction of a Japanese Relevance-tagged Corpus.

 

13:40-15:00  LUNCH BREAK

 

15:00-17:20  PANEL P4 & SESSIONS IN PARALLEL: SO9, WO23, WO24, TO1

 

PANEL P4 - Room Camara

 

15:00-17:20

Nicoletta Calzolari, Ralph Grishman, Martha Palmer, Standards & best practice for multilingual computational lexicons: ISLE MILE ... and more.

 

SESSION SO9: EMOTIONAL & SPECIFIC DATABASES - Room Atlantico

 

15:00-15:20

Parham Mokhtari, Nick Campbell, Automatic Detection of Acoustic Centres of Reliability for Tagging Paralinguistic Information in Expressive Speech.

15:20-15:40

Vladimir Hozjan, Zdravko Kacic, Objective analysis of emotional speech for English and Slovenian Interface emotional speech databases.

15:40-16:00

Vladimir Hozjan, Zdravko Kacic, Asunción Moreno, Antonio Bonafonte, Albino Nogueiras, Interface Databases: Design and Collection of a Multilingual Emotional Speech Database.

16:00-16:20

Nick Campbell, Recording techniques for capturing natural every-day speech.

16:20-16:40

Laura Pecchia, Giuseppe Cappelli, Elisabetta Guazzini, Linguistic and Computational Problems for the Creation of an Italian Children's Corpus of Spoken Language.

16:40-17:00

Hiromichi Kawanami, Tsuyoshi Masuda, Tomoki Toda, Kiyohiro Shikano, Designing speech database with prosodic variety for expressive TTS system.

17:00-17:20

Nobuo Kawaguchi, Shigeki Matsubara, Kazuya Takeda, Fumitada Itakura, Multi-Dimensional Data Acquisition for Integrated Acoustic Information Research.

 

SESSION WO23: CORPUS ANALYSIS, ANNOTATION, REPRESENTATION - Room La Graciosa

 

15:00-15:20

Irena Spasić, Goran Nenadić, Sophia Ananiadou, Tuning Context Features with Genetic Algorithms.


15:20-15:40

Steve Cassidy, XQuery as an Annotation Query Language: a Use Case Analysis.

15:40-16:00

Adán Cassán, Sergi Cervell, Mireia Colom, Rafael Marín, Josep M. Merenciano, Gema Pérez, Lluís Valentín, A step forward to hypertext.

16:00-16:20

Xiaoyi Ma, Haejoong Lee, Steven Bird, Kazuaki Maeda, Models and Tools for Collaborative Annotation.

16:20-16:40

Nigel Collier, Koichi Takeuchi, Chikashi Nobata, Junichi Fukumoto, Norihiro Ogata, Progress on Multi-lingual Named Entity Annotation Guidelines using RDF(S).

16:40-17:00

Brian Mitchell, Robert Gaizauskas, A Comparison of Machine Learning Algorithms for Prepositional Phrase Attachment.

17:00-17:20

R. Muñoz, R. Mitkov, M. Palomar, J. Peral, R. Evans, L. Moreno, Bilingual alignment of anaphoric expressions.

 

SESSION WO24: APPLICATIONS BASED ON WRITTEN LRs – Room Tenerife

 

15:00-15:20

A. Cappelli, M. N. Catarsi, P. Michelassi, L. Moretti, M. Baglioni, F. Turini, M. Tavoni, Knowledge Mining and Discovery for Searching in Literary Texts.

15:20-15:40

Richard F. E. Sutcliffe, Kieran White, Searching via Keywords or Concept Hierarchies - Which is Better?

15:40-16:00

Ganesh Ramesh, Amit Bagga, A Text-based Method for Detection and Filtering of Commercial Segments in Broadcast News.

16:00-16:20

Nadjet Bouayad-Agha, Richard Power, Donia Scott, Anja Belz, PILLS: Multilingual generation of medical information documents with overlapping content.

16:20-16:40

Masumi Narita, Kazuya Kurokawa, Takehito Utsuro, A Web-based English Abstract Writing Tool Using a Tagged E-J Parallel Corpus.

16:40-17:00

Jimmy Lin, The Web as a Resource for Question Answering: Perspectives and Challenges.

17:00-17:20

Philippe Langlais, Marie Loranger, Guy Lapalme, Translators at work with TRANSTYPE: Resource and Evaluation.

 

 

 

 

SESSION TO1: TERMINOLOGY - Room Lanzarote

 

15:00-15:20

Udo Hahn, Stefan Schulz, Towards Very Large Ontologies for Medical Language Processing.

15:20-15:40

Maria Rzewuska, Terminology Resources in the Context of a Major Translation Project.

15:40-16:00

Klaus-Dirk Schmitz, Subject-field-specific Ontologies and Terminologies for the Web Community.

16:00-16:20

Goran Nenadić, Irena Spasić, Sophia Ananiadou, Automatic Acronym Acquisition and Term Variation Management within Domain-Specific Texts.

16:20-16:40

Antonio S. Valderrábanos, Alexander Belskis, Luis Iraola Moreno, Multilingual Terminology Extraction and Validation.

16:40-17:00

Le An Ha, Learning description of term patterns using glossary resources.

 

17:20-18:00  CLOSING SESSION - Room Sinfonica

 

21:00  GALA DINNER – Santa Catalina Hotel