LREC 2000 2nd International Conference on Language Resources & Evaluation

Introductory Messages

Message of the Chairman of the Local Organising Committee:
Professor George Carayannis

Introduction of the Conference Chairman:
Antonio Zampolli

Message from ELRA's CEO:
Khalid Choukri




Researches for the Millennium
Catherine Macleod

Human Language Technology Resources for Central European Languages: European Integration Issues
Zygmunt Vetulani

Multilingual Content Encoding and Translation
Antonio Sanfilippo

Session WO1 - Corpus Tagging

Developing Guidelines and Ensuring Consistency for Chinese Text Annotation
Xia Fei, Palmer Martha, Xue Nianwen, Okurowski Mary Ellen, Kovarik John, Chiou Fu-Dong, Huang Shizhe, Kroch Tony, Marcus Mitch

Using Machine Learning Methods to Improve Quality of Tagged Corpora and Learning Models
Matsumoto Yuji, Yamashita Tatsuo

Bootstrapping a Tagged Corpus through Combination of Existing Heterogeneous Taggers
Zavrel Jakub, Daelemans Walter

Something Borrowed, Something Blue: Rule-based Combination of POS Taggers
Borin Lars 

Session EO1 - Evaluation of Machine Translation

Determining the Tolerance of Text-handling Tasks for MT Output
White John, Doyon Jennifer, Talbott Susan

Evaluating Translation Quality as Input to Product Development
Bohan Niamh, Breidt Elisabeth, Volk Martin

An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research
Nießen Sonja, Och Franz Josef, Leusch Gregor, Ney Hermann 

Session SO1 - Data Centers / Major Projects

Issues in Corpus Creation and Distribution: The Evolution of the Linguistic Data Consortium
Cieri Christopher, Liberman Mark

The Establishment of Motorola's Human Language Data Resource Center: Addressing the Criticality of Language Resources in the Industrial Setting
Talley Jim

A Platform for Dutch in Human Language Technologies
D'Halleweyn Elisabeth, Dewallef Erwin, Beeken Jeannine

Recent Developments within the European Language Resources Association (ELRA)
Choukri Khalid, Mance Audrey, Mapelli Valerie

COCOSDA - a Progress Report
Campbell Nick

Survey of Language Engineering Needs: a Language Resources Perspective
Allen Jeffrey, Choukri Khalid 

Session WO2 - Treebanks

Building a Treebank for French
Abeille Anne, Clement Lionel, Kinyon Alexandra

Semantico-syntactic Tagging of Very Large Corpora: the Case of Restoration of Nodes on the Underlying Level
Hajicova Eva, Sgall Petr

Building a Treebank for Italian: a Data-driven Annotation Schema
Bosco Cristina, Lombardo Vincenzo, Vassallo Daniela, Lesmo Leonardo

A Treebank of Spanish and its Application to Parsing
Moreno Antonio, Grishman Ralph, Lopez Susana, Sanchez Fernando, Sekine Satoshi

Shallow Parsing and Functional Structure in Italian Corpora
Delmonte Rodolfo

An XML-based Representation Format for Syntactically Annotated Corpora
Mengel Andreas, Lezius Wolfgang 

Session WO3 - Corpus Categorisation

Modern Greek Corpus Taxonomy
Mikros George, Carayannis George

Automatic Style Categorisation of Corpora in the Greek Language
Tambouratzis George, Markantonatou Stella, Hairetakis Nikolaos, Carayannis George

TyPTex: Inductive Typological Text Classification by Multivariate Statistical Analysis for NLP Systems Tuning/Evaluation
Folch Helka, Heiden Serge, Habert Benoit, Fleury Serge, Illouz Gabriel, Lafon Pierre, Nioche Julien, Prevost Sophie

Session WO4 - Reusability Issues

Language Resources as by-Product of Evaluation: The MULTITAG Example
Paroubek Patrick

Enabling Resource Sharing in Language Generation: an Abstract Reference Architecture
Cahill Lynne, Doran Christy, Evans Roger, Kibble Rodger, Mellish Chris, Paiva D., Reape Mike, Scott Donia, Tipper Neil

Experiences of Language Engineering Algorithm Reuse
Gamback Bjorn, Olsson Fredrik 

Session SO2 - Dialogue Evaluation Methods

Dialogue and Prompting Strategies Evaluation in the DEMON System
Lavelle Carine-Alexia, De Calmes Martine, Perennou Guy

Predictive Performance of Dialog Systems
Bonneau-Maynard H., Devillers L., Rosset S.

A Methodology for Evaluating Spoken Language Dialogue Systems and Their Components
Bernsen Niels Ole, Dybkjær Laila

Developing and Testing General Models of Spoken Dialogue System Performance
Walker Marilyn, Kamm Candace, Boland Julie  

Session WO5 - Corpus Tools

A Framework for Cross-Document Annotation
Day David, Goldschen Alan, Henderson John

Providing Internet Access to Portuguese Corpora: the AC/DC Project
Santos Diana, Bick Eckhard

Annotating a Corpus to Develop and Evaluate Discourse Entity Realization Algorithms: Issues and Preliminary Results
Poesio Massimo

Using Few Clues Can Compensate the Small Amount of Resources Available for Word Sense Disambiguation
de Loupy Claude, El-Beze Marc 

Session WO6 - Acquisition of Lexical Information

Learning Verb Subcategorization from Corpora: Counting Frame Subsets
Zeman Daniel, Sarkar Anoop

Tuning Lexicons to New Operational Scenarios
Basili Roberto, Pazienza Maria Teresa, Vindigni Michele, Zanzotto Fabio Massimo

A Flexible Infrastructure for Large Monolingual Corpora
Quasthoff Uwe, Wolff Christian

Automatic Generation of Dictionary Definitions from a Computational Lexicon
Labropoulou Penny, Mantzari Elena, Papageorgiou Harris, Gavrilidou Maria 

Session SP1 - Phonetic Issues and Speech Synthesis

MHATLex: Lexical Resources for Modelling the French Pronunciation
Perennou Guy, De Calmes Martine

PLEDIT - A New Efficient Tool for Management of Multilingual Pronunciation Lexica and Batchlists
Vlaj Damjan, Kaiser Janez, Wilhelm Ralph, Ziegenhain Ute

Object-oriented Access to the Estonian Phonetic Database
Meister Einar, Eek Arvo, Altosaar Toomas, Vainio Martti

A French Phonetic Lexicon with Variants for Speech and Language Processing
de Mareuil Philippe Boula, d'Alessandro Christophe, Yvon Francois, Auberge Veronique, Vaissiere Jacqueline, Amelot Angelique

A Computational Platform for Development of Morphologic and Phonetic Lexica
Rojc Matej, Kacic Zdravko

An Optimised FS Pronunciation Resource Generator for Highly Inflecting Languages
Gibbon Dafydd, Quirino Simoes Ana Paula, Matthiesen Martin

Design Methodology for Bilingual Pronunciation Dictionary
Kim Jong-mi

Labeling of Prosodic Events in Slovenian Speech Database GOPOLIS
Mihelic France, Gros Jerneja, Noth Elmar, Warnke Volker

Regional Pronunciation Variants for Automatic Segmentation
Beringer Nicole, Neff Marcia

Le Programme Compalex (COMPAraison LEXicale)
Ndamba Josue, Bayamboussa Jean Silence

Perceptual Evaluation of Text-to-Speech Implementation of Enclitic Stress in Greek
Fotinea Stavroula-Evita, Protopapas Athanassios, Dimitriadis Dimitris, Carayannis George

Etude et Evaluation de la Di-Syllabe comme Unite Acoustique pour le Systeme de Synthese Arabe PARADIS
Chenfour N., Benabbou A., Mouradi A.

Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System
Rojc Matej, Kacic Zdravko 

Session WP1 - Lexicon

The Bank of Swedish
Gellerstam Martin, Cederholm Yvonne, Rasmark Torgny

The Multi-layer Language Knowledge Base of Chinese NLP
Junfeng Hu, Shiwen Yu

Producing LRs in Parallel with Lexicographic Description: the DCC project
Soler i Bou Joan

Some Language Resources and Tools for Computational Processing of Portuguese at INESC
Wittmann Luzia, Ribeiro Ricardo Daniel, Pego Tania, Batista Fernando

Screffva: A Lexicographer's Workbench
Mills Jon

The Concede Model for Lexical Databases
Erjavec Tomaz, Evans Roger, Ide Nancy, Kilgarriff Adam

Automatically Expansion of Thesaurus Entries with a Different Thesaurus
Kashioka Hideki, Shirai Satosi

Electronic Language Resources for Polish: POLEX, CEGLEX and GRAMLEX
Vetulani Zygmunt

Turkish Electronic Living Lexicon (TELL): A Lexical Database
Inkelas Sharon, Kuntay Aylin, Orgun C. Orhan, Sprouse Ronald

Tools for the Generation of Morphological Entries in Dictionaries
Viks Ulle

Design and Construction of Knowledge Base for Verb Using MRD and Tagged Corpus
Chae Young-Soog, Choi Key-Sun 

Session SP2 - Spoken Language Resources Issues from Construction to Validation

Recruitment Techniques for Minority Language Speech Databases: Some Observations
Jones Rhys James, Mason John S., Helliker Louise, Pawlewski Mark

Enhancing Speech Corpus Resources with Multiple Lexical Tag Layers
Witt Andreas, Lungen Harald, Gibbon Dafydd

What Are Transcription Errors and Why Are They Made?
Oppermann Daniela, Burger Susanne, Weilhammer Karl

Quality Control in Large Annotation Projects Involving Multiple Judges: The Case of the TDT Corpora
Strassel Stephanie, Graff David, Martey Nii, Cieri Christopher

A New Methodology for Speech Corpora Definition from Internet Documents
Vaufreydaz D., Bergamini C., Serignat J.F., Besacier L., Akbar M.

Many Uses, Many Annotations for Large Speech Corpora: Switchboard and TDT as Case Studies
Graff David, Bird Steven

SLR Validation: Present State of Affairs and Prospects
van den Heuvel Henk, Boves Lou, Choukri Khalid, Goddijn Simo, Sanders Eric

On the Usage of Kappa to Evaluate Agreement on Coding Tasks
Di Eugenio Barbara 

Session WP2 - Corpus Annotation

A Word-level Morphosyntactic Analyzer for Basque
Aduriz I., Agirre E., Aldezabal I., Arregi X., Arriola J. M., Artola X., Gojenola K., Maritxalar A., Sarasola K., Urkia M.

Interactive Corpus Annotation
Brants Thorsten, Plaehn Oliver

Semi-automatic Construction of a Tree-annotated Corpus Using an Iterative Learning Statistical Language Model
Shirai Kiyoaki, Tanaka Hozumi, Tokunaga Takenobu

A Robust Parser for Unrestricted Greek Text
Boutsis Sotiris, Prokopidis Prokopis, Giouli Voula, Piperidis Stelios

Automatic Assignment of Grammatical Relations
Lesmo Leonardo, Lombardo Vincenzo

Resources for Lexicalized Tree Adjoining Grammars and XML Encoding: TagML
Bonhomme Patrice, Lopez Patrice

CLinkA - A Coreferential Links Annotator
Orasan Constantin

Coreference in Annotating a Large Corpus
Hajicova Eva, Panenova Jarmila, Sgall Petr

FAST - Towards a Semi-automatic Annotation of Corpora
Barbu Catalina

Layout Annotation in a Corpus of Patient Information Leaflets
Bouayad-Agha Nadjet 

Session WP3 - Multilingual Corpora

Designing a Tool for Exploiting Bilingual Comparable Corpora
Bennison Peter, Bowker Lynne

A Word Sense Disambiguation Method Using Bilingual Corpus
Jie Zheng, Yuhang Mao

Building the Croatian-English Parallel Corpus
Tadic Marko

A Parallel Corpus of Italian/German Legal Texts
Gamper Johann

Lexical and Translation Equivalence in Parallel Corpora
Varadi Tamas

Some Technical Aspects about Aligning Near Languages
de Yzaguirre Lluis, Ribas Marta, Vivaldi Jordi, Cabre M. Teresa

Cairo: An Alignment Visualization Tool
Smith Noah A., Jahr Michael E. 


Keynote Speeches

Next Generation Natural Language Applications
Salim Roukos

Terminology Standards - Help for the Terminology Community
Alan K. Melby, Klaus-Dirk Schmitz


International Co-operation in the field of Language Resources and Evaluation
Antonio Zampolli, Lynette Hirschman


Session SO3 - Speech Synthesis

GREEK ToBI: A System for the Annotation of Greek Speech Corpora
Arvaniti Amalia, Baltazani Mary

EULER: an Open, Generic, Multilingual and Multi-platform Text-to-Speech System
Dutoit Thierry, Bagein Michel, Malfrere Fabrice, Pagel Vincent, Ruelle Alain, Tounsi Nawfal, Wynsberghe Dominique

POSCAT: A Morpheme-based Speech Corpus Annotation Tool
Kim Byeongchang, Cha Jeongwon, Lee Geunbae, Lee Jin-seok  

Session WO7 - Syntactic Parsing

A Strategy for the Syntactic Parsing of Corpora: from Constraint Grammar Output to Unification-based Processing
Badia Toni, Egea Angels

Learning Preference of Dependency between Japanese Subordinate Clauses and its Evaluation in Parsing
Utsuro Takehito

An Open Source Grammar Development Environment and Broad-coverage English Grammar Using HPSG
Copestake Ann, Flickinger Dan 

Session WO8 - Acquisition of Semantic Information

Controlled Bootstrapping of Lexico-semantic Classes as a Bridge between Paradigmatic and Syntagmatic Knowledge: Methodology and Evaluation
Allegrini Paolo, Montemagni Simonetta, Pirrelli Vito

Automatic Extraction of Semantic Similarity of Words from Raw Technical Texts
Thanopoulos Aristomenis, Fakotakis Nikos, Kokkinakis George

Abstraction of the EDR Concept Classification and its Effectiveness in Word Sense Disambiguation
Kazuhiro Kimura, Hideki Hirakawa 

Session EO2 - Evaluation of Tools

Where Opposites Meet. A Syntactic Meta-scheme for Corpus Annotation and Parsing Evaluation
Lenci Alessandro, Montemagni Simonetta, Pirrelli Vito, Soria Claudia

A Comparison of Summarization Methods Based on Task-based Evaluation
Hajime Mochizuki, Manabu Okumura

Evaluation of TRANSTYPE, a Computer-aided Translation Typing System: A Comparison of a Theoretical- and a User-oriented Evaluation Procedures
Langlais Philippe, Sauve Sebastien, Foster George, Macklovitch Elliott, Lapalme Guy 

Session SO4 - Speech Synthesis Evaluation

The Cost258 Signal Generation Test Array
Bailly Gerard, Banga Eduardo R., Monaghan Alex, Rank Erhard

Guidelines for Japanese Speech Synthesizer Evaluation
Itahashi Shuichi

Perception and Analysis of a Reiterant Speech Paradigm: a Functional Diagnostic of Synthetic Prosody
Rilliard Albert, Auberge Veronique

Session WO9 - Applications in the Written Area

Looking for Errors: A Declarative Formalism for Resource-adaptive Language Checking
Bredenkamp Andrew, Crysmann Berthold, Petrea Mirela

An Architecture for Document Routing in Spanish: Two Language Components, Pre-processor and Parser
Rojo Guillermo, Alvarez Maria Concepcion, Alvarino Pilar, Gil Adelaida, Santalla Maria Paula, Sotelo Susana

Extraction of Unknown Words Using the Probability of Accepting the Kanji Character Sequence as One Word
Shinnou Hiroyuki, Ikeya Masanori 

Session WO10 - Semantic Annotation of Corpora

An Experiment of Lexical-Semantic Tagging of an Italian Corpus
Corazzari Ornella, Calzolari Nicoletta, Zampolli Antonio

Semantic Tagging for the Penn Treebank
Palmer Martha, Trang Dang Hoa, Rosenzweig Joseph

A Step toward Semantic Indexing of an Encyclopedic Corpus
Alcouffe Philippe, Gacon Nicolas, Roux Claude, Segond Frederique

Session SO5 - Evaluation of Dialogue

Obtaining Predictive Results with an Objective Evaluation of Spoken Dialogue Systems: Experiments with the DCR Assessment Paradigm
Antoine Jean-Yves, Siroux Jacques, Caelen Jean, Villaneau Jeanne, Goulian Jerome, Ahafhaf Mohamed

Lessons Learned from a Task-based Evaluation of Speech-to-Speech Machine Translation
Levin Lori, Bartlog Boris, Font Llitjos Ariadna, Gates Donna, Lavie Alon, Wallace Dorcas, Watanabe Taro, Woszczyna Monika

Galaxy-II as an Architecture for Spoken Dialogue Evaluation
Polifroni Joseph, Seneff Stephanie

Issues in the Evaluation of Spoken Dialogue Systems - Experience from the ACCeSS Project
Brey Thomas, Hanrieder Gerhard, Heisterkamp Paul, Hitzenberger Ludwig, Regel-Brietzmann Peter

Evaluation for Darpa Communicator Spoken Dialogue Systems
Walker Marilyn, Hirschman Lynette, Aberdeen John

Evaluation of a Dialogue System Based on a Generic Model that Combines Robust Speech Understanding and Mixed-initiative Control
Diaz Verdejo J.E., Lopez-Cozar R., Rubio A.J., De la Torre A.

Session WO11 - Mono-Multilingual Lexicon Acquisition and Building

Automatic Extraction of English-Chinese Term Lexicons from Noisy Bilingual Corpora
Le Sun, Youbing Jin, Lin Du, Yufang Sun

Chinese-English Semantic Resource Construction
Dorr Bonnie J., Levow Gina-Anne, Lin Dekang, Thomas Scott

Towards A Universal Tool For NLP Resource Acquisition
Sheremetyeva Svetlana, Nirenburg Sergei

Acquisition of Linguistic Patterns for Knowledge-based Information Extraction
Harabagiu Sanda M., Maiorano Steven J.

Using Lexical Semantic Knowledge from Machine Readable Dictionaries for Domain Independent Language Modelling
Demetriou George, Atwell Eric, Souter Clive

ItalWordNet: a Large Semantic Database for Italian
Roventini Adriana, Alonge Antonietta, Calzolari Nicoletta, Magnini Bernardo, Bertagna Francesca

Session WO12 - Language Resources: Infrastructural Issues

An Open Architecture for the Construction and Administration of Corpora
Orasan Constantin, Krishnamurthy Ramesh

Corpus Resources and Minority Language Engineering
McEnery Tony, Baker Paul, Burnard Lou

Towards a Query Language for Annotation Graphs
Bird Steven, Buneman Peter, Tan Wang-Chiew

Software Infrastructure for Language Resources: a Taxonomy of Previous Work and a Requirements Analysis
Cunningham Hamish, Bontcheva Kalina, Tablan Valentin, Wilks Yorick

XCES: An XML-based Encoding Standard for Linguistic Corpora
Ide Nancy, Bonhomme Patrice, Romary Laurent

The American National Corpus: A Standardized Resource for American English
Macleod Catherine, Ide Nancy, Grishman Ralph  

Session TO1 - Terminology

Accessibility of Multilingual Terminological Resources - Current Problems and Prospects for the Future
Budin Gerhard, Melby Alan K.

Terminology in Korea: KORTERM
Choi Key-Sun, Chae Young-Soog

ARC A3: A Method for Evaluating Term Extracting Tools and/or Semantic Relations between Terms from Corpora
Jouis Christophe, ARC A3

Use of Greek and Latin Forms for Term Detection
Estopa Rosa, Vivaldi Jordi, Cabre M. Teresa

Automatically Augmenting Terminological Lexicons from Untagged Text
Demetriou George, Gaizauskas Robert

Creating and Using Domain-specific Ontologies for Terminological Applications
Maynard Diana, Ananiadou Sophia 

Session SP3 - Spoken Language Resources' Projects

SALA: SpeechDat across Latin America. Results of the First Phase
Moreno Asuncion, Comeyne Robrecht, Haslam Keith, van den Heuvel Henk, Hoge Harald, Horbach Sabine, Micca Giorgio

SPEECON - Speech Data for Consumer Devices
Siemund Rainer, Hoge Harald, Kunzmann Siegfried, Marasek Krzysztof

The Spoken Dutch Corpus. Overview and First Evaluation
Oostdijk Nelleke

SPEECHDAT-CAR. A Large Speech Database for Automotive Environments
Moreno Asuncion, Lindberg Borge, Draxler Christoph, Richard Gael, Choukri Khalid, Euler Stephan, Allen Jeffrey

Creation of Spoken Hebrew Databases
Rannon Tami, Golani Ofra, Goren Anat, Shammass Sherrie, Moyal Ami

Spoken Portuguese: Geographic and Social Varieties
Bettencourt Goncalves Jose, Veloso Rita

Orthographic Transcription of the Spoken Dutch Corpus
Goedertier Wim, Goddijn Simo, Martens Jean-Pierre

Development of Acoustic and Linguistic Resources for Research and Evaluation in Interactive Vocal Information Servers
Bernardis Giulia, Bourlard Herve, Rajman Martin, Chappelier Jean-Cedric

Development and Evaluation of an Italian Broadcast News Corpus
Federico Marcello, Giordani Dimitri, Coletti Paolo

Large, Multilingual, Broadcast News Corpora for Cooperative Research in Topic Detection and Tracking: The TDT-2 and TDT-3 Corpus Efforts
Cieri Christopher, Graff David, Liberman Mark, Martey Nii, Strassel Stephanie

Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era
Chien Lee-Feng, Lee Lin-Shan

Shallow Discourse Genre Annotation in CallHome Spanish
Ries Klaus, Levin Lori, Valle Liza, Lavie Alon, Waibel Alex

Issues in Design and Collection of Large Telephone Speech Corpus for Slovenian Language
Kacic Zdravko, Horvat Bogomir, Zogling Aleksandra

Spontaneous Speech Corpus of Japanese
Maekawa Kikuo, Koiso Hanae, Furui Sadaoki, Isahara Hitoshi

Corpora of Slovene Spoken Language for Multi-lingual Applications
Gros Jerneja, Mihelic France, Dobrisek Simon, Erjavec Tomaz, Zganec Mario

The ISLE Corpus of Non-Native Spoken English
Menzel Wolfgang, Atwell Eric, Bonaventura Patrizia, Herron Daniel, Howarth Peter, Morton Rachel, Souter Clive

Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition
Nakamura Satoshi, Hiyane Kazuo, Asano Futoshi, Nishiura Takanobu, Yamada Takeshi

The Influence of Scenario Constraints on the Spontaneity of Speech. A Comparison of Dialogue Corpora
Weilhammer Karl, Oppermann Daniela, Burger Susanne

Developing a Multilingual Telephone Based Information System in African Languages
Roux J.C., Botha E.C., du Preez J.A.  

Session WP4 - Lexicon: Semantic and Multilingual Issues

Extraction of Concepts and Multilingual Information Schemes from French and English Economics Documents
Cadel Peggy, Ledouble Helene

Application of WordNet ILR in Czech Word-formation
Klimova Jana, Pala Karel

Coping with Lexical Gaps when Building Aligned Multilingual Wordnets
Bentivogli Luisa, Pianta Emanuele, Pianesi Fabio

Extension and Use of GermaNet, a Lexical-Semantic Database
Kunze Claudia

CDB - A Database of Lexical Collocations
Krenn Brigitte

Towards a Strategy for a Representation of Collocations - Extending the Danish PAROLE-lexicon
Braasch Anna, Olsen Sussi

Improving Lexical Databases with Collocational Information: Data from Portuguese
Guerreiro Paula

A Bilingual Electronic Dictionary for Frame Semantics
Fontenelle Thierry

A Text->Meaning->Text Dictionary and Process
Dutoit Dominique

Production of NLP-oriented Bilingual Language Resources from Human-oriented Dictionaries
Fluhr-Semenova Vera, Fluhr Christian, Brisson Stephanie

Session TP1 - Terminology

Terms Specification and Extraction within a Linguistic-based Intranet Service
Pedrazzini Sandro, Maier Elisabeth, Konig Dierk

With WORLDTREK Family, Create, Update and Browse your Terminological World
Abbas Yasmina, Picard Marie-Luce

Extraction of Semantic Clusters for Terminological Information Retrieval from MRDs
Sierra Gerardo, McNaught John

Reusing the Mikrokosmos Ontology for Concept-based Multilingual Terminology Databases
Moreno Antonio, Perez Chantal

Term-based Identification of Sentences for Text Summarisation
Georgantopoulos Byron, Piperidis Stelios

Terminology Encoding in View of Multifunctional NLP Resources
Katsoyannou Marianna, Efthimiou Eleni

ARISTA Generative Lexicon for Compound Greek Medical Terms
Kontos John, Malagardi Ioanna, Fountoukis Spyros 

Session WP5 - Corpus Tagging

Hua Yu: A Word-segmented and Part-Of-Speech Tagged Chinese Corpus
Maosong Sun, Honglin Sun, Changning Huang, Pu Zhang, Hongbing Xing, Qiang Zhou

Morphological Tagging to Resolve Morphological Ambiguities
Birocheau Gaelle

Morphemic Analysis and Morphological Tagging of Latvian Corpus
Levane Kristine, Spektors Andrejs

Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets
Dzeroski Saso, Erjavec Tomaz, Zavrel Jakub

Using a Large Set of EAGLES-compliant Morpho-syntactic Descriptors as a Tagset for Probabilistic Tagging
Tufis Dan

The Context (not only) for Humans
Hladka Barbora

PoS Disambiguation and Partial Parsing Bidirectional Interaction
Felipe Montserrat Marimon, Porta Zamorano Jordi

Rule-based Tagging: Morphological Tagset versus Tagset of Analytical Functions
Ribarov Kiril 

Session WP6 - Tools in the Written Area

The New Edition of the Natural Language Software Registry (an Initiative of ACL hosted at DFKI)
Declerck Thierry, Werner Jachmann Alexander, Uszkoreit Hans

Open Ended Computerized Overview of Controlled Languages
Gavieiro-Villatte Elisa, Spaggiari Laurent

Automatic Transliteration and Back-transliteration by Decision Tree Learning
Kang Byung-Ju, Choi Key-Sun

The Universal XML Organizer: UXO
Milde Jan-Torsten, Reinsch Markus

LT TTT - A Flexible Tokenisation Tool
Grover Claire, Matheson Colin, Mikheev Andrei, Moens Marc

Will Very Large Corpora Play For Semantic Disambiguation The Role That Massive Computing Power Is Playing For Other AI-Hard Problems?
Cucchiarelli Alessandro, Faggioli Enrico, Velardi Paola

Interarbora and Thistle - Delivering Linguistic Structure by the Internet
Calder Jo

A Proposal for the Integration of NLP Tools using SGML-Tagged Documents
Artola X., de Ilarraza A. Diaz, Ezeiza N., Gojenola K., Maritxalar A., Soroa A.

Reusability as Easy Adaptability: A Substantial Advance in NL Technology
Prodanof Irina, Cappelli Amedeo, Moretti Lorenzo 


Keynote Speeches

Meeting Recognition and Tracking
Alex Waibel

The Evolution of an NLP System
Stephen D. Richardson


Speech Database Processing Tools - the state of the art in automatic labeling of speech
Nick Campbell

Session WO13 - Multilingual Resources and Applications

Grammarless Bracketing in an Aligned Bilingual Corpus
Kinoshita Jorge

Constructing a Tagged E-J Parallel Corpus for Assisting Japanese Software Engineers in Writing English Abstracts
Narita Masumi

Multilingual Linguistic Resources: From Monolingual Lexicons to Bilingual Interrelated Lexicons
Villegas Marta, Bel Nuria, Lenci Alessandro, Calzolari Nicoletta, Ruimy Nilda, Zampolli Antonio, Sadurni Teresa, Soler i Bou Joan

TransSearch: A Free Translation Memory on the World Wide Web
Macklovitch Elliott, Simard Michel, Langlais Philippe

Session WO14 - Named Entity Recognition

Annotating Resources for Information Extraction
Boisen Sean, Crystal Michael R., Schwartz Richard, Stone Rebecca, Weischedel Ralph

Integrating Seed Names and ngrams for a Named Entity List and Classifier
Buchholz Sabine, van den Bosch Antal

Named Entity Recognition in Greek Texts
Demiros Iason, Boutsis Sotiris, Giouli Voula, Liakata Maria, Papageorgiou Harris, Piperidis Stelios

Minimally Supervised Japanese Named Entity Recognition: Resources and Evaluation
Utsuro Takehito, Sassano Manabu 

Session EO3 - Evaluation and Semantics

English Senseval: Report and Results
Kilgarriff Adam, Rosenzweig Joseph

Evaluation of a Generic Lexical Semantic Resource in Information Extraction
Yue Chai Joyce

Sublanguage Dependent Evaluation: Toward Predicting NLP Performances
Illouz Gabriel

Evaluation of Word Alignment Systems
Ahrenberg Lars, Merkel Magnus, Sagvall Hein Anna, Tiedemann Jorg

Session WO15 - Language Resources Projects

Language Resources Development at the Spanish Royal Academy
Municio Angel Martin, Rojo Guillermo, Sanchez Leon Fernando, Pinillos Octavio

A Self-Expanding Corpus Based on Newspapers on the Web
Hofland Knut

For a Repository of NLP Tools
Chaudiron Stephane, Choukri Khalid, Mance Audrey, Mapelli Valerie  

Session WO16 - Corpus Annotation and Information Extraction

Coreference Annotation: Whither?
Kibble Rodger, van Deemter Kees

Annotating Events and Temporal Information in Newswire Texts
Setzer Andrea, Gaizauskas Robert

A Semi-automatic System for Conceptual Annotation, its Application to Resource Construction and Evaluation
Black W.J., McNaught John, Zarri G.P., Persidis A., Brasher A., Gilardoni L., Bertino E., Semeraro G., Leo P.

Session EO4 - Grammars and Systems Evaluation

Using a Formal Approach to Evaluate Grammars
Gargouri Bilel, Jmaiel Mohamed, Hamadou Abdelmajid Ben

Towards More Comprehensive Evaluation in Anaphora Resolution
Mitkov Ruslan

Coreference Resolution Evaluation Based on Descriptive Specificity
Trouilleux Francois, Gaussier Eric, Bes Gabriel G., Zaenen Annie

Session SO6 - Recognition

Methods and Metrics for the Evaluation of Dictation Systems: a Case Study
Canelli Maria, Grasso Daniele, King Margaret

Design Issues in Text-Independent Speaker Recognition Evaluation
Martin Alvin, Przybocki Mark

Perceptual Evaluation of a New Subband Low Bit Rate Speech Compression System based on Waveform Vector Quantization and SVD Postfiltering
Fotinea Stavroula-Evita, Dologlou Ioannis, Bakamidis Stylianos, Stainhaouer Gregory, Carayannis George

IPA Japanese Dictation Free Software Project
Shikano Kiyohiro, Kawahara Tatsuya, Takeda Kazuya, Yamada Atsushi, Itou Akinori, Itou Katsunobu, Utsuro Takehito, Kobayashi Tetsunori, Minematsu Nobuaki, Yamamoto Mikio, Sagayama Shigeki, Lee Akinobu

The COST 249 SpeechDat Multilingual Reference Recogniser
Johansen Finn Tore, Warakagoda Narada, Lindberg Borge, Lehtinen Gunnar, Kacic Zdravko, Zgank Andrej, Elenius Kjell, Salvi Giampiero

Automotive Speech-Recognition - Success Conditions Beyond Recognition Rates
Bengler Klaus

Evaluating Multi-party Multi-modal Systems
Damianos Laurie E., Drury Jill, Fanderclai Tari, Hirschman Lynette, Oshika Beatrice

Session WO17 - Semantic Lexicons

What's in a Thesaurus?
Kilgarriff Adam, Yallop Colin

SIMPLE: A General Framework for the Development of Multilingual Lexicons
Bel Nuria, Busa Federica, Calzolari Nicoletta, Gola Elisabetta, Lenci Alessandro, Monachini Monica, Ogonowski Antoine, Peters Ivonne, Peters Wim, Ruimy Nilda, Villegas Marta, Zampolli Antonio

The Treatment of Adjectives in SIMPLE: Theoretical Observations
Peters Ivonne, Peters Wim

Lexicalised Systematic Polysemy in WordNet
Peters Ivonne, Peters Wim

Annotating, Disambiguating & Automatically Extending the Coverage of the Swedish SIMPLE Lexicon
Kokkinakis Dimitrios, Toporowska Gronostaj Maria, Warmenius Karin

Semantic Encoding of Danish Verbs in SIMPLE - Adapting a Verb Framed Model to a Satellite-framed Language
Sandford Pedersen Bolette, Nimb Sanni

Integrating Subject Field Codes into WordNet
Magnini Bernardo, Cavaglia Gabriela 

Session WO18 - Morphology in Lexical and Textual Resources

Principled Hidden Tagset Design for Tiered Tagging of Hungarian
Tufis Dan, Dienes Peter, Oravecz Csaba, Varadi Tamas

Part of Speech Tagging and Lemmatisation for the Spoken Dutch Corpus
Van Eynde Frank, Zavrel Jakub, Daelemans Walter

Inter-annotator Agreement for a German Newspaper Corpus
Brants Thorsten

An Approach to Lexical Development for Inflectional Languages
Turcato Davide, Toole Janine, Tsiplakou Stavroula, Heift Trude, McFetridge Paul

GeDeriF: Automatic Generation and Analysis of Morphologically Constructed Lexical Resources
Namer Fiammetta, Dal Georgette

A Unified POS Tagging Architecture and its Application to Greek
Papageorgiou Harris, Prokopidis Prokopis, Giouli Voula, Piperidis Stelios

Derivation in the Czech National Corpus
Klimova Jana, Kocek Jan 

Session EO5 - Information Retrieval and Question Answering Evaluation

The Evaluation of Systems for Cross-language Information Retrieval
Braschler Martin, Harman Donna, Hess Michael, Kluck Michael, Peters Carol, Schauble Peter

IREX: IR & IE Evaluation Project in Japanese
Sekine Satoshi, Isahara Hitoshi

Textual Information Retrieval Systems Test: The Point of View of an Organizer and Corpuses Provider
Kremer Patrick, Schmitt Laurent

Multilingual Topic Detection and Tracking: Successful Research Enabled by Corpora and Evaluation
Wayne Charles L.

How to Evaluate Your Question Answering System Every Day ... and Still Get Real Work Done
Breck Eric J., Burger John D., Ferro Lisa, Hirschman Lynette, House David, Light Marc, Mani Inderjeet

The TREC-8 Question Answering Track
Voorhees Ellen M., Tice Dawn M.

Cardinal, Nominal or Ordinal Similarity Measures in Comparative Evaluation of Information Retrieval Process
Michel Christine 

Session SP4 - Tools for Evaluation and Processing of Spoken Language Resources

Transcribing with Annotation Graphs
Geoffrois Edouard, Barras Claude, Bird Steven, Wu Zhibiao

SpeechDat-Car Fixed Platform
Fonollosa Jose A.R., Moreno Asuncion

Automatic Speech Segmentation in High Noise Condition
Ivanov Rosen

SegWin: a Tool for Segmenting, Annotating, and Controlling the Creation of a Database of Spoken Italian Varieties
Refice Mario, Savino Michelina, Altieri Marco, Altieri Roberto

A Graphical Parametric Language-Independent Tool for the Annotation of Speech Corpora
Georgila Kallirroi, Fakotakis Nikos, Kokkinakis George

NaniTrans: a Speech Labelling Tool
Portabella David, Febrer Albert, Moreno Asuncion

Annotation of a Multichannel Noisy Speech Corpus
Cristoforetti L., Matassoni M., Omologo M., Svaizer P., Zovato E.

Dialogue Annotation for Language Systems Evaluation
Charfuelan Marcela, Relano Gil Jose, Rodriguez Gancedo M. Carmen, Tapias Merino Daniel, Gomez Luis Hernandez

Annotating Communication Problems Using the MATE Workbench
Dybkjær Laila, Moller Morten Baun, Bernsen Niels Ole, Grosse Michael, Olsen Martin, Schiffrin Amanda

The MATE Workbench Annotation Tool, a Technical Description
Isard Amy, McKelvie David, Mengel Andreas, Moller Morten Baun

On the Use of Prosody for On-line Evaluation of Spoken Dialogue Systems
Swerts Marc, Krahmer Emiel

MDWOZ: A Wizard of Oz Environment for Dialog Systems Development
Munteanu Cosmin, Boldea Marian

End-to-End Evaluation of Machine Interpretation Systems: A Graphical Evaluation Tool
Jekat Susanne J., Tessiore Lorenzo

Cross-lingual Interpolation of Speech Recognition Models
Micca Giorgio, Frasca Alessandra, Di Benedetto Maria Gabriella

Session WP7 - Corpus Projects

Rarity of Words in a Language and in a Corpus
Hlavacova Jaroslava

The PAROLE Program
Vignaux Georges

Portuguese Corpora at CLUL
Bacelar do Nascimento Maria Fernanda, Pereira Luisa, Saramago Joao

Russian Monitor Corpora: Composition, Linguistic Encoding and Internet Publication
Yablonsky Serge A.

A Web-based Text Corpora Development System
Bohus Dan, Boldea Marian

Issues from Corpus Analysis That Have Influenced the On-going Development of Various Haitian Creole Text- and Speech-based NLP Systems and Applications
Mason Marilyn 

Session EP1 - Evaluation and Written Area

Enhancing the TDT Tracking Evaluation
Bagga Amit

Target Suites for Evaluating the Coverage of Text Generators
Bateman John A., Hartley Anthony F.

A Novelty-based Evaluation Method for Information Retrieval
Fujii Atsushi, Ishikawa Tetsuya

How To Evaluate and Compare Tagsets? A Proposal
Dejean Herve

Evaluating Summaries for Multiple Documents in an Interactive Environment
Stein Gees C., Strzalkowski Tomek, Wise G. Bowden, Bagga Amit

Establishing the Upper Bound and Inter-judge Agreement of a Verb Classification Task
Merlo Paola, Stevenson Suzanne

A Parallel English-Japanese Query Collection for the Evaluation of On-Line Help Systems
Sutcliffe Richard F.E., Kurohashi Sadao

An HPSG-Annotated Test Suite for Polish
Marciniak Malgorzata, Mykowiecka Agnieszka, Kupsc Anna, Przepiorkowski Adam

Evaluation of Computational Linguistic Techniques for Identifying Significant Topics for Browsing Applications
Klavans Judith L., Wacholder Nina, Evans David K.

Session SP5 - Multimodal - Multimedia Resources and Tools

The EUDICO Project, Multi Media Annotation over the Internet
Russel Albert, Brugman Hennie, Broeder Daan, Wittenburg Peter

Towards a Standard for Meta-descriptions of Language Resources
Broeder Daan, Brugman Hennie, Russel Albert, Skiba R., Wittenburg Peter

ATLAS: A Flexible and Extensible Architecture for Linguistic Annotation
Bird Steven, Day David, Garofolo John, Henderson John, Laprun Christophe, Liberman Mark

Models of Russian Text/Speech Interactive Databases for Supporting of Scientific, Practical and Cultural Researches
Skrelin Pavel, Sherstinova Tatiana

A Multi-view Hyperlexicon Resource for Speech and Language System Development
Gibbon Dafydd, Trippel Thorsten

Addizionario: an Interactive Hypermedia Tool for Language Learning
Turrini Giovanna, Cignoni Laura, Paccosi Alessandro 

Session WP8 - Corpus Tools

A Web-based Advanced and User Friendly System: The Oslo Corpus of Tagged Norwegian Texts
Johannessen Janne Bondi, Noklestad Anders, Hagen Kristin

Introduction of KIBS (Korean Information Base System) Project
Chae Young-Soog, Choi Key-Sun

Design and Implementation of the Online ILSP Greek Corpus
Hatzigeorgiu Nick, Gavrilidou Maria, Piperidis Stelios, Carayannis George, Papakostopoulou Anastasia, Spiliotopoulou Athanassia, Vacalopoulou Anna, Labropoulou Penny, Mantzari Elena, Papageorgiou Harris, Demiros Iason

The (Un)Deterministic Nature of Morphological Context
Ribarov Kiril

A Software Toolkit for Sharing and Accessing Corpora Over the Internet
Luz Saturnino

GRUHD: A Greek database of Unconstrained Handwriting
Kavallieratou E., Liolios N., Koutsogeorgos E., Fakotakis Nikos, Kokkinakis George 

Session WP9 - Applications using Written Language Resources

Resources for Multilingual Text Generation in Three Slavic Languages
Bateman John A., Teich Elke, Kruijff Geert-Jan, Kruijff-Korbayova Ivanna, Sharoff Serge, Skoumalova Hana

Evaluating Wordnets in Cross-language Information Retrieval: the ITEM Search Engine
Verdejo Felisa, Gonzalo Julio, Penas Anselmo, Lopez Fernando, Fernandez David

NL-Translex: Machine Translation for Dutch
Cucchiarini Catia, Van Hoorde Johan, D'Halleweyn Elisabeth

Typographical and Orthographical Spelling Error Correction
Min Kyongho, Wilson William H., Moon Yoo-Jin

LEXIPLOIGISSI: An Educational Platform for the Teaching of Terminology in Greece
Economou Constandina, Raptis Spyros, Stainhaouer Gregory

Collocations as Word Co-occurrence Restriction Data - An Application to Japanese Word Processor -
Shudo Kosho, Takahashi Masahito, Koyama Yasuo, Yoshimura Kenji