First International Conference
on Language Resources and Evaluation (LREC)
Granada, May 28-30 1998
SPEECH DATABASE DEVELOPMENT FOR CENTRAL AND EASTERN EUROPEAN LANGUAGES
Organised by the BABEL Project, Copernicus No. 1304
Wednesday, May 27th, 14.30 - 19.00
See Instructions for authors.
This workshop, which is held in conjunction with the First International
Conference on Language Resources and Evaluation in Granada, Spain, will
be concerned with the design, production and transcription standards required
for the construction of speech databases for languages of Central and Eastern
Europe.
Speech databases have been produced for a number of the world's major
languages, but most languages of Central and Eastern Europe have received
little attention in international terms until recently, though they are
of major importance for the future of European speech science. There are
special issues which arise in the production of representative samples
of these languages, and this workshop will attempt to address these issues.
The BABEL project (funded by the European Union under the COPERNICUS programme,
project #1304) has been working on these issues since 1995, and will soon
complete a database of Bulgarian, Estonian, Hungarian, Polish and Romanian.
The work of the project will be reported at the workshop, and aspects of
the project will be the subject of practical demonstrations, but it is
hoped that papers will be contributed by other interested researchers who
are not associated with the project.
Information about BABEL can be read on its WWW pages.
Information about the main conference can be read on its WWW pages.
Programme | Time | Author(s) | Title & link to abstract
Welcome | 14:30 | Peter Roach | Introduction
Paper 1 | 14:40 | Arvo Eek, Einar Meister | Estonian speech in the BABEL multilanguage database: phonetic-phonological problems revealed in the text corpus
Paper 2 | 15:00 | Slawomir Kula | Telephone bandwidth speech database: creation, applications and experiences for Polish language
Paper 3 | 15:20 | Henk van den Heuvel, Valery Galounov, Herbert S. Tropf | The SPEECHDAT(E) project: Creating speech databases for Eastern European languages
Open Forum 1 | 15:40 | | The nature of our data
Paper 4 | 16:00 | Klara Vicsi, A. Vig, G. Gordos | Experience on the development of a language independent automatic segmentation and labeling system on the frame of the BABEL project
Paper 5 | 16:20 | Simon Dobrisek, Jerneja Gros, France Mihelic, Nikola Pavesic | GOPOLIS: A Multi Speaker Slovenian Speech Database
Coffee Break | 16:40 | |
Paper 6 | 17:00 | Toomas Altosaar, Matti Karjalainen, Martti Vainio, Einar Meister | Finnish and Estonian Speech Applications developed on an Object-Oriented Speech Processing and Database System
Open Forum 2 | 17:20 | | Labelling and annotation
Paper 7 | 17:45 | Marian Boldea, Cosmin Munteanu, Alin Doroga | Design, Collection, and Annotation of a Romanian Speech Database
Paper 8 | 18:05 | Tamas Varadi | On the Spoken Corpus of the Budapest Sociolinguistic Interview
Paper 9 | 18:25 | Zdravko Kacic, Janez Kaiser | Development of Slovenian SpeechDat database
Open Forum 3 | 18:45 | | The Future
CLOSE | 19:30 | |
INSTRUCTIONS FOR AUTHORS
- Details of the required
format are available from the LREC web site.
- The deadline for submission of the completed paper is now April 14th.
- Submission should be via email or on floppy disk to
the contact address below.
- Papers should be submitted as a Microsoft Word for
Windows file (or other formats by arrangement with S.C.Arnfield@rdg.ac.uk).
ORGANISING COMMITTEE
- Peter Roach, University of Reading, UK (BABEL Project Coordinator)
- Klara Vicsi, Technical University, Budapest
- Lori Lamel, LIMSI, Paris
CONTACT PERSON
Peter Roach, Department of Linguistic Science, University of Reading,
Reading RG6 6AA, UK.
Tel: (+44) 118 931 8138 Fax: (+44) 118 9753365
email: p.j.roach@reading.ac.uk
WORKSHOP TOPICS
We hope that the following topics can be considered in the workshop;
this list is not exhaustive, however.
- Recording techniques and standards
- Available software tools
- Annotation, transcription and labelling
- Automated time-alignment of labels
- Phonetic problems of specific languages of Central and Eastern Europe
- Quality control
- Requirements for larger-scale databases
- Dissemination of data; recording further languages; possibilities for
future collaboration.
THE WORKSHOP WILL CONCLUDE WITH A DISCUSSION OF THE POSSIBILITY
OF FORMING AN INFORMAL ASSOCIATION OF RESEARCHERS SPECIALISING IN THE SPOKEN
FORMS OF CENTRAL AND EASTERN EUROPEAN LANGUAGES.
Project Co-ordinator: Professor Peter Roach (P.J.Roach@reading.ac.uk)
Dr. Simon Arnfield (S.C.Arnfield@reading.ac.uk)