Title |
Re-using high-quality resources for continued evaluation of automated summarization systems |
Author(s) |
Laura Alonso i Alemany (1), Maria Fuentes (2), Marc Massot (3), Horacio Rodríguez (2). Affiliations: (1) GRIAL, Departament de Lingüística General, Universitat de Barcelona; (2) TALP Research Centre, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya; (3) Departament d'Informàtica i Matemàtica Aplicada, Universitat de Girona |
Session |
O29-EMSW |
Abstract |
In this paper we present a method for re-using the human judgements on summary quality provided by the DUC contest. The score awarded to an automatic summary is calculated as a function of the scores manually assigned to the most similar summaries of the same document. This approach enhances the standard n-gram based evaluation of automatic summarization systems by establishing similarities between extractive (vs. abstractive) summaries and by taking advantage of the large number of evaluated summaries available from the DUC contest. The utility of this method is exemplified by the improvements achieved on a headline production system. |
Keyword(s) |
automatic summarization, automatic evaluation, evaluation of subjective tasks |
Language(s) | English |
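
The scoring idea stated in the abstract (scoring a new summary from the human scores of the most similar judged summaries of the same document) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the number of neighbours, or the combining function. The multiset Jaccard overlap on n-grams, the parameter `k`, the similarity-weighted average, and all function names below are assumptions chosen only for illustration.

```python
from collections import Counter


def ngrams(text, n=1):
    """Return a multiset of word n-grams (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(a, b, n=1):
    """Multiset Jaccard overlap of n-grams; an assumed similarity measure."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    union = sum((ga | gb).values())
    return sum((ga & gb).values()) / union if union else 0.0


def estimate_score(candidate, judged, k=2, n=1):
    """Estimate a score for `candidate` from human-judged summaries.

    `judged` is a list of (summary_text, human_score) pairs for the same
    document. The k most similar judged summaries are combined with a
    similarity-weighted average (an assumed combining function).
    Returns None when the candidate shares nothing with any judged summary.
    """
    ranked = sorted(
        ((overlap(candidate, text, n), score) for text, score in judged),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in ranked)
    if total == 0:
        return None
    return sum(sim * score for sim, score in ranked) / total


# Hypothetical judged summaries and scores, invented for this example.
judged = [("the cat sat on the mat", 4.0), ("dogs bark loudly", 1.0)]
identical = estimate_score("the cat sat on the mat", judged)
unrelated = estimate_score("zebra", judged)
```

Because only judged summaries of the same document are compared, a sketch like this is most meaningful for extractive summaries, which tend to share surface n-grams with one another.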