LREC 2000 2nd International Conference on Language Resources & Evaluation
Conference Papers
Title | Developing and Testing General Models of Spoken Dialogue System Performance
Authors |
Walker Marilyn (AT&T Labs - Research, 180 Park Ave, Florham Park, N.J. 07932, U.S.A., walker@research.att.com)
Kamm Candace (AT&T Labs - Research, 180 Park Ave, Florham Park, N.J. 07932, U.S.A., cak@research.att.com)
Boland Julie (AT&T Labs - Research, 180 Park Ave, Florham Park, N.J. 07932, U.S.A., boland@louisiana.edu)
Keywords |
Session | Session SO2 - Dialogue Evaluation Methods |
Abstract | The design of methods for performance evaluation is a major open research issue in the area of spoken language dialogue systems. This paper presents the PARADISE methodology for developing predictive models of spoken dialogue performance, and shows how to evaluate the predictive power and generalizability of such models. To illustrate the methodology, we develop a number of models for predicting system usability (as measured by user satisfaction), based on the application of PARADISE to experimental data from two different spoken dialogue systems. We compare both linear and tree-based models. We then measure the extent to which the models generalize across different systems, different experimental conditions, and different user populations, by testing models trained on a subset of the corpus against a test set of dialogues. The results show that the models generalize well across the two systems, and are thus a first approximation towards a general performance model of system usability.
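The abstract does not reproduce the PARADISE performance function itself, so the sketch below is only an assumption-laden illustration of the train/test procedure it describes: fit a linear regression and a regression tree to predict user-satisfaction scores from per-dialogue features, then compare both on held-out dialogues. The feature names (task success, number of turns, ASR error rate), the synthetic data, and the scikit-learn estimators are illustrative choices, not the paper's actual predictors or models.

```python
# Illustrative sketch only: synthetic stand-ins for PARADISE-style dialogue
# features and user-satisfaction scores, not the paper's data or formula.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 200  # hypothetical number of dialogues

# Hypothetical per-dialogue features: task success (0..1), number of turns,
# ASR word error rate -- placeholders for the kinds of predictors PARADISE uses.
task_success = rng.uniform(0, 1, n)
num_turns = rng.integers(5, 40, n)
asr_error = rng.uniform(0, 0.5, n)
X = np.column_stack([task_success, num_turns, asr_error])

# Synthetic user-satisfaction target loosely tied to the features plus noise.
y = 3.0 * task_success - 0.03 * num_turns - 2.0 * asr_error + rng.normal(0, 0.3, n)

# Train on a subset of the "corpus" and test on held-out dialogues,
# mirroring the generalization test described in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

linear = LinearRegression().fit(X_train, y_train)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X_train, y_train)

print("linear model R^2 on held-out dialogues:", r2_score(y_test, linear.predict(X_test)))
print("tree model   R^2 on held-out dialogues:", r2_score(y_test, tree.predict(X_test)))
```

Comparing held-out fit in this way is only a simple proxy for the evaluation in the paper, which trains on a subset of dialogues and tests generalization across different systems, experimental conditions, and user populations.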