OrderSum: Semantic Sentence Ordering for Extractive Summarization
Taewan Kwon, Sangyong Lee
TL;DR
OrderSum addresses the underexplored problem of sentence order in extractive summarization by embedding candidate summaries in a semantic space that encodes sentence order. It combines sentence extraction with a summary-level triplet ranking objective that integrates ROUGE signals, including $\text{ROUGE-L}_{\text{full}}$, and uses anchor candidate sampling to manage training cost. Empirically, OrderSum achieves a state-of-the-art ROUGE-L score of 30.52 on CNN/DailyMail (+2.54 over the previous best model) and performs strongly on XSum, WikiHow, and PubMed, while qualitative analyses indicate improved sentence order over prior methods. The work demonstrates the practical impact of optimizing at the summary level for both sentence inclusion and sentence order, and points to future work on longer summaries and abstractive extensions.
Abstract
Recent extractive summarization follows two main approaches: the sentence-level framework, which decides individually which sentences to include in a summary, and the summary-level framework, which generates multiple candidate summaries and ranks them. Previous work in both frameworks has focused primarily on which sentences of a document should be included in the summary. However, the sentence order of an extractive summary, which is critical to its quality, remains underexplored. In this paper, we introduce OrderSum, a novel extractive summarization model that semantically orders the sentences within an extractive summary. OrderSum introduces a representation method that incorporates sentence order into the embedding of an extractive summary, and an objective function that trains the model to identify which extractive summary has the better sentence order in the semantic space. Extensive experimental results demonstrate that OrderSum obtains state-of-the-art performance in both sentence inclusion and sentence order for extractive summarization. In particular, OrderSum achieves a ROUGE-L score of 30.52 on CNN/DailyMail, outperforming the previous state-of-the-art model by a margin of 2.54.
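To illustrate why sentence order matters for ROUGE-L, the following minimal sketch scores two extractive summaries that contain the same sentences in different orders. It is not the OrderSum implementation or the official ROUGE toolkit: it assumes a simplified ROUGE-L F1 computed from the longest common subsequence (LCS) over whitespace tokens of the whole summary, and the example sentences and the function names `lcs_length` and `rouge_l_f1` are invented for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L F1 over whitespace tokens (an assumption, not the official scorer)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and two candidate orderings of the same sentences.
reference = "the storm hit the coast on monday . thousands lost power overnight ."
in_order = ["the storm hit the coast .", "thousands lost power ."]
swapped = list(reversed(in_order))

score_in_order = rouge_l_f1(" ".join(in_order), reference)
score_swapped = rouge_l_f1(" ".join(swapped), reference)

print(f"in order: {score_in_order:.4f}")
print(f"swapped:  {score_swapped:.4f}")
```

Both candidates include exactly the same sentences, so a metric that only measures inclusion would rate them equally. Because the LCS respects token order, the swapped candidate scores lower under this simplified ROUGE-L, which is the kind of gap a model that optimizes sentence order can close.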
