Summary of the paper

Title Data-driven Summarization of Scientific Articles
Authors Nikola Nikolov, Michael Pfeiffer, Richard Hahnloser
Abstract Data-driven approaches to sequence-to-sequence modelling have been successfully applied to short text summarization of news articles. Such models are typically trained on input-summary pairs consisting of only a single or a few sentences, partially due to limited availability of multi-sentence training data. Here, we propose to use scientific articles as a new milestone for text summarization: large-scale training data come almost for free with two types of high-quality summaries at different levels - the title and the abstract. We generate two novel multi-sentence summarization datasets from scientific articles and test the suitability of a wide range of existing extractive and abstractive neural network-based summarization approaches. Our analysis demonstrates that scientific papers are suitable for data-driven text summarization. Our results could serve as valuable benchmarks for scaling sequence-to-sequence models to very long sequences.
Full paper Data-driven Summarization of Scientific Articles
Bibtex @InProceedings{NIKOLOV18.2,
  author = {Nikola Nikolov and Michael Pfeiffer and Richard Hahnloser},
  title = {Data-driven Summarization of Scientific Articles},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-20-7},
  language = {english}
  }