Annotation plays an important role in speech databases. However, annotation is time-consuming and labor-intensive. This paper proposes providing phoneme-level labeling candidates generated by state-of-the-art ASR models, from which annotators manually choose the appropriate labels and make the final decision. A posterior probability evaluation method is also applied to measure the annotation results. The BLCU-SAIT speech corpus, a corpus aimed at computer-aided pronunciation training (CAPT), is labeled with this approach. Experimental results show that the mean consistency rate of the manual labels is 87.2% and the posterior F1 score is 0.857. With this method, the annotation task is converted from an open-ended question into a multiple-choice one, and the annotation results meet the requirements of CAPT systems.
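To make the candidate-selection and posterior-check ideas concrete, the following is a minimal Python sketch. It assumes per-segment phoneme posterior probabilities from an ASR model; the function names (topk_candidates, posterior_f1), the phoneme inventory, the confidence threshold, and the toy numbers are illustrative assumptions, not the authors' implementation.

import numpy as np

def topk_candidates(posteriors, inventory, k=4):
    # For each segment, list the k phoneme labels with the highest ASR
    # posterior probability; annotators pick from this short list, so the
    # open-ended labeling task becomes a multiple-choice one.
    order = np.argsort(posteriors, axis=1)[:, ::-1][:, :k]
    return [[inventory[i] for i in row] for row in order]

def posterior_f1(flagged, truly_wrong):
    # F1 of a posterior-based check that flags suspect labels: `flagged`
    # marks segments whose top posterior fell below a threshold,
    # `truly_wrong` marks segments whose label was actually incorrect
    # according to manual verification.
    tp = float(np.sum(flagged & truly_wrong))
    precision = tp / max(np.sum(flagged), 1)
    recall = tp / max(np.sum(truly_wrong), 1)
    return 0.0 if tp == 0 else 2 * precision * recall / (precision + recall)

# Toy example: 3 segments over a 4-phoneme inventory, top-2 candidates.
post = np.array([[0.70, 0.20, 0.05, 0.05],
                 [0.10, 0.50, 0.30, 0.10],
                 [0.05, 0.05, 0.88, 0.02]])
print(topk_candidates(post, ["a", "i", "u", "e"], k=2))
# -> [['a', 'i'], ['i', 'u'], ['u', 'i']]
flags = post.max(axis=1) < 0.6          # low-confidence segments (assumed threshold)
wrong = np.array([False, True, False])  # hypothetical gold verification
print(posterior_f1(flags, wrong))       # -> 1.0

The separation into a candidate generator and a posterior check mirrors the workflow the abstract describes: ASR narrows the label space for annotators, while the posterior evaluation estimates how trustworthy the resulting labels are.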
@InProceedings{XIE18.10,
  author    = {Yanlu Xie and Xin Wei},
  title     = {A Semi-manual Annotation Approach for Large CAPT Speech Corpus},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year      = {2018},
  month     = {may},
  date      = {7-12},
  location  = {Miyazaki, Japan},
  editor    = {Erhong Yang and Le Sun},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris, France},
  isbn      = {979-10-95546-29-0},
  language  = {english}
}