Title
Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units
Authors
Klim Peshkov and Laurent Prévot
Abstract
Knowledge of evaluation metrics and of best practices for using them has improved rapidly in recent years (Fort et al., 2012). However, these advances mostly concern the evaluation of classification-related tasks; segmentation tasks have received less attention, even though they are crucial in a large number of linguistic studies. A range of metrics is available (F-score on boundaries, F-score on units, WindowDiff (WD), Boundary Similarity (BS)), but it remains relatively difficult to interpret these metrics on various linguistic segmentation tasks, such as prosodic and discourse segmentation. In this paper, we take real segmented datasets (introduced in Peshkov et al. (2012)) as references, which we deteriorate in different ways (random addition of boundaries, random removal of boundaries, introduction of near-miss errors). This provides us with various measures on controlled datasets and with an interesting benchmark for various linguistic segmentation tasks.
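Of the metrics compared in the paper, WindowDiff (Pevzner & Hearst, 2002) can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes segmentations are encoded as binary sequences (1 marks a boundary after a position), and the function name and the default window-size heuristic (half the average reference segment length) are our own choices.

```python
def window_diff(ref, hyp, k=None):
    """WindowDiff between a reference and a hypothesis segmentation.

    ref, hyp: equal-length sequences of 0/1, where 1 marks a boundary.
    k: window size; defaults to half the average reference segment length.
    Returns a value in [0, 1]; 0 means the segmentations agree everywhere.
    """
    n = len(ref)
    if len(hyp) != n:
        raise ValueError("sequences must have equal length")
    if k is None:
        num_segments = sum(ref) + 1
        k = max(1, round(n / (2 * num_segments)))
    # Slide a window of size k and count windows where the number of
    # boundaries differs between reference and hypothesis.
    errors = sum(
        1
        for i in range(n - k + 1)
        if sum(ref[i:i + k]) != sum(hyp[i:i + k])
    )
    return errors / (n - k + 1)
```

Because the comparison is made per window rather than per exact position, a near-miss (a boundary shifted by one position) is penalized less than a fully missing or spurious boundary, which is one reason the paper's controlled degradations (addition, removal, near-miss) separate the metrics' behaviors.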
Topics
Discourse Annotation, Representation and Processing, Prosody
Full paper
Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units
Bibtex
@InProceedings{PESHKOV14.931,
  author    = {Klim Peshkov and Laurent Prévot},
  title     = {Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units},
  booktitle = {Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)},
  year      = {2014},
  month     = {may},
  date      = {26-31},
  address   = {Reykjavik, Iceland},
  editor    = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Hrafn Loftsson and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn      = {978-2-9517408-8-4},
  language  = {english}
}