Classify or Select: Neural Architectures for Extractive Document Summarization
Ramesh Nallapati, Bowen Zhou, Mingbo Ma
TL;DR
The paper develops two RNN-based architectures, Classifier and Selector, for extractive single-document summarization. Both use a hierarchical encoding and a score-based mechanism over abstract features (salience, novelty, content, redundancy, position). Shallow and deep variants are evaluated on Daily Mail and DUC 2002: the deep Classifier models achieve strong performance and often surpass baselines, while the Selector offers benefits in less-structured settings. A novel abstractive training approach links SummaRuNNer with an RNN decoder to train from abstractive references, and extensive qualitative analyses highlight interpretability through feature visualization and learned weights. The work also discusses implications for domain transfer and suggests, as future work, applying the Selector to more unstructured tasks and incorporating beam search.
Abstract
We present two novel and contrasting Recurrent Neural Network (RNN) based architectures for extractive summarization of documents. The Classifier architecture visits the sentences in their original document order and sequentially accepts or rejects each one for membership in the final summary. The Selector architecture, on the other hand, is free to pick one sentence at a time, in any order, to piece together the summary. Models under both architectures jointly capture the notions of salience and redundancy of sentences. In addition, these models are highly interpretable, since their predictions can be visualized broken down by abstract features such as information content, salience and redundancy. We show that our models reach or outperform state-of-the-art supervised models on two different corpora. Based on experimental evidence, we also identify the conditions under which each architecture is preferable to the other.
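The contrast between the two decision procedures can be illustrated with a minimal sketch. This is not the paper's model: the RNN scorer is replaced by a hypothetical hand-written function, toy_score, that subtracts a redundancy penalty from a fixed salience value, and the names classify, select, SALIENCE, SIMILARITY, the 0.5 threshold and the summary length limit are all assumptions made for illustration. Only the control flow (one pass in document order versus repeated free choice among the remaining sentences) reflects the description above.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer signature: (candidate sentence index, indices already
# in the summary) -> score. In the paper this role is played by an RNN-based
# model; here it is a stand-in for illustration only.
Scorer = Callable[[int, Sequence[int]], float]

# Toy data (assumed): per-sentence salience, and one pair of near-duplicate
# sentences (2 and 3) with similarity 0.7. All other pairs have similarity 0.
SALIENCE = [0.6, 0.3, 0.9, 0.8]
SIMILARITY = {(2, 3): 0.7}


def toy_score(i: int, summary: Sequence[int]) -> float:
    """Salience minus the highest similarity to any sentence already chosen."""
    redundancy = max(
        (SIMILARITY.get((min(i, j), max(i, j)), 0.0) for j in summary),
        default=0.0,
    )
    return SALIENCE[i] - redundancy


def classify(n_sentences: int, score: Scorer, threshold: float = 0.5) -> List[int]:
    """Classifier-style: visit sentences in document order, accept or reject each."""
    summary: List[int] = []
    for i in range(n_sentences):
        if score(i, summary) >= threshold:
            summary.append(i)
    return summary


def select(n_sentences: int, score: Scorer, max_sentences: int) -> List[int]:
    """Selector-style: repeatedly pick the best remaining sentence, in any order."""
    summary: List[int] = []
    remaining = set(range(n_sentences))
    while remaining and len(summary) < max_sentences:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=lambda i: score(i, summary))
        summary.append(best)
        remaining.remove(best)
    return summary
```

With this toy data, classify(4, toy_score) returns [0, 2] and select(4, toy_score, 2) returns [2, 0]. Both procedures skip sentence 3 because it is redundant with sentence 2, but the Selector-style loop emits the most salient sentence first rather than following document order.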
