LREC 2000 2nd International Conference on Language Resources & Evaluation
 


Title Term-based Identification of Sentences for Text Summarisation
Authors Georgantopoulos Byron (Institute for Language and Speech Processing, Epidavrou & Artemidos 6, 151 25 Maroussi, Greece, email: byron@ilsp.gr)
Piperidis Stelios (Institute for Language and Speech Processing, Artemidos 6 & Epidavrou, 151 25, Athens, Greece, tel: +301 6875300, fax: +301 6854270, spip@ilsp.gr)
Keywords Automatic Term Extraction, Sentence Extraction, Statistical NLP, Terminological Resources, Text Summarisation
Session Session TP1 - Terminology
Full Paper 106.ps, 106.pdf
Abstract This paper describes a methodology for the automatic summarisation of Greek texts that combines terminology extraction and sentence spotting. Since generating abstracts has proven a hard NLP task of questionable effectiveness, the paper focuses on producing a special kind of abstract, the extract: a set of sentences taken from the original text. These sentences are selected according to the amount of information they carry about the subject content. The proposed corpus-based, statistical approach exploits several heuristics to determine the summary-worthiness of sentences. Specifically, it uses the statistical occurrence of terms (the TF·IDF formula) and several cue phrases to calculate sentence weights, and then extracts the top-scoring sentences, which form the extract.
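
The sentence-weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact weighting, so the function names (`tokenize`, `split_sentences`, `idf_table`, `extract`), the logarithmic IDF, the additive `cue_bonus`, and the naive regex tokeniser are all assumptions made for illustration. Plain word tokens also stand in for the extracted terms the paper works with.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive word tokeniser (assumption); \w is Unicode-aware, so Greek works too.
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    # Naive splitter on ., ! and ? followed by whitespace (assumption).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def idf_table(documents):
    # IDF(t) = log(N / df(t)), where df(t) counts the documents containing t.
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


def extract(text, corpus, cue_phrases=(), cue_bonus=1.0, top_n=2):
    """Return the top_n highest-weighted sentences of text, in original order.

    Hypothetical scoring: the sum of TF*IDF over the distinct words of a
    sentence, plus cue_bonus for every cue phrase the sentence contains.
    """
    idf = idf_table(list(corpus) + [text])  # include the text itself in the counts
    tf = Counter(tokenize(text))            # term frequency within the text
    scored = []
    for index, sentence in enumerate(split_sentences(text)):
        weight = sum(tf[t] * idf.get(t, 0.0) for t in set(tokenize(sentence)))
        lowered = sentence.lower()
        weight += cue_bonus * sum(1 for p in cue_phrases if p.lower() in lowered)
        scored.append((weight, index, sentence))
    # Rank by weight (ties broken by position), then restore document order.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:top_n]
    return [sentence for _, _, sentence in sorted(best, key=lambda item: item[1])]
```

Calling `extract(text, corpus, cue_phrases=("in conclusion",))` returns the selected sentences in the order they appear in `text`, so the extract reads as a shortened version of the original. Words that occur in every document get an IDF of zero and therefore contribute nothing to a sentence's weight.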