LREC 2000 2nd International Conference on Language Resources & Evaluation |
Title | SegWin: a Tool for Segmenting, Annotating, and Controlling the Creation of a Database of Spoken Italian Varieties |
Authors | Refice Mario (Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Orabona, 4 - 70125 Bari - ITALY, refice@poliba.it); Savino Michelina (Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Orabona, 4 - 70125 Bari - ITALY, esavino@poliba.it); Altieri Marco (Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Orabona, 4 - 70125 Bari - ITALY); Altieri Roberto (Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Orabona, 4 - 70125 Bari - ITALY) |
Keywords | Annotation Tools, Corpora, Databases, Segmentation Tools, Spoken Language Varieties |
Session | Session SP4 - Tools for Evaluation and Processing of Spoken Language Resources |
Full Paper | 310.ps, 310.pdf |
Abstract | A number of actions have recently been proposed to fill the gap in the availability of annotated speech corpora of Italian regional varieties. A first step is the national project AVIP (Archivio delle Varietà di Italiano Parlato, Spoken Italian Varieties Archive), whose main challenge is methodological: finding annotation strategies and developing suitable software tools to cope with the inadequacy of linguistic models for Italian accent variations. These strategies consist in adopting an iterative labelling process, so that a description of each variety can be reached through successive refinement stages without losing the information from intermediate stages. To satisfy these requirements, a dedicated software system called SegWin has been developed at the Politecnico di Bari, which: • “guides” the human transcribers through the annotation phases by means of a “scheduled procedure”; • allows incremental addition of information at any stage of the database creation; • monitors and checks the consistency of the database at every stage of its creation. The system has been used extensively by all the partners of the AVIP project and is continuously updated to meet the project's needs. The main characteristics of SegWin are described here in relation to the above-mentioned aspects. |