Summary of the paper

Title: A Semi-manual Annotation Approach for Large CAPT Speech Corpus
Authors: Yanlu Xie and Xin Wei
Abstract: Annotation plays an important role in speech databases; however, it is time-consuming and requires many annotators. This paper proposes providing phoneme-level labeling candidates generated by state-of-the-art ASR models, from which annotators manually choose the appropriate label and make the final decision. A posterior-probability-based evaluation method is also applied to measure the annotation results. The BLCU-SAIT speech corpus, a corpus aimed at computer-aided pronunciation training (CAPT), is labeled with this annotation approach. Experimental results show that the mean consistency rate of the manual labels is 87.2% and the posterior F1 score is 0.857. With this method, the annotation task is converted from an open-ended question into a multiple-choice question, and the annotation results meet the requirements of CAPT systems.
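As a rough illustration of the F1 metric reported above (the counts below are made up for the example and are not from the paper), an F1 score over label-verification decisions can be computed as the harmonic mean of precision and recall:

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """F1 = harmonic mean of precision and recall.

    Here a "positive" would be a label decision the evaluation method flags,
    checked against a reference judgment; exact definitions are assumptions,
    not taken from the paper.
    """
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts chosen only to show the arithmetic:
print(round(f1_score(90, 15, 15), 3))  # → 0.857
```

Equivalently, F1 = 2·TP / (2·TP + FP + FN), so in this made-up example 180 / 210 ≈ 0.857.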
Topics: Manual Label, Annotation, F1 Score
Full paper: A Semi-manual Annotation Approach for Large CAPT Speech Corpus
Bibtex: @InProceedings{XIE18.10,
  author = {Yanlu Xie and Xin Wei},
  title = {A Semi-manual Annotation Approach for Large CAPT Speech Corpus},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {Erhong Yang and Le Sun},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-29-0},
  language = {english}
  }
© 2018 ELDA/ELRA