SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization

Yixin Liu, Pengfei Liu

TL;DR

SimCLS introduces a two-stage framework that separates generation from evaluation in order to optimize summary quality directly with a contrastive, reference-free evaluator. The generation stage uses a Seq2Seq model to create diverse candidate summaries, while the evaluation stage trains a RoBERTa-based scorer with a ranking loss to predict each candidate's quality from the source document. On CNN/DailyMail and XSum, SimCLS achieves state-of-the-art or near-state-of-the-art ROUGE and semantic scores, with larger gains on CNN/DailyMail. Fine-grained entity- and sentence-level analyses, together with evidence of reduced positional bias, provide supplementary insight. The work highlights the potential to close the gap between training objectives and evaluation metrics in abstractive summarization and suggests that the two-stage, contrastive paradigm may apply more broadly.

Abstract

In this paper, we present SimCLS, a conceptually simple yet empirically powerful framework for abstractive summarization. It bridges the gap between the learning objective and the evaluation metrics that results from the currently dominant sequence-to-sequence learning framework, by formulating text generation as a reference-free evaluation problem (i.e., quality estimation) assisted by contrastive learning. Experimental results show that, with minor modifications to existing top-scoring systems, SimCLS improves their performance by a large margin: in particular, a 2.51 absolute improvement over BART and 2.50 over PEGASUS in ROUGE-1 on the CNN/DailyMail dataset, driving state-of-the-art performance to a new level. We have open-sourced our code and results: https://github.com/yixinL7/SimCLS. Results of our proposed models have been deployed on the ExplainaBoard platform, which allows researchers to understand our systems in a more fine-grained way.

Paper Structure

This paper contains 19 sections, 3 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: SimCLS framework for two-stage abstractive summarization, where $\text{Doc}$, $\text{S}$, $\text{Ref}$ represent the document, generated summary and reference respectively. At the first stage, a Seq2Seq generator (BART) is used to generate candidate summaries. At the second stage, a scoring model (RoBERTa) is used to predict the performance of the candidate summaries based on the source document. The scoring model is trained with contrastive learning, where the training examples are provided by the Seq2Seq model.
  • Figure 2: Test performance with different numbers of candidate summaries on CNNDM. Origin denotes the original performance of the baseline model.
  • Figure 3: Positional Bias. X-axis: the relative position of the matched sentence in source documents. Y-axis: the ratio of the matched sentences. For fair comparison, articles are first truncated to the generator's maximum input length. Origin denotes the original performance of the baseline model.
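
The two-stage idea described in the Figure 1 caption (generate candidates, then score each one against the source document with a model trained by a ranking objective) can be illustrated with a small sketch. This is not the authors' implementation: the function names `cosine`, `pairwise_ranking_loss`, and `select_best` are hypothetical, plain Python lists stand in for the encoder outputs that RoBERTa would produce, and both the cosine-similarity scoring and the rank-scaled margin are assumptions made for illustration, since this page only states that a ranking loss is used.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_ranking_loss(scores: List[float], margin: float = 0.01) -> float:
    """Margin ranking loss over candidates sorted best-first (assumed form).

    scores[i] is the scorer's output for the i-th candidate. Candidates are
    assumed to be pre-sorted by a reference-based metric such as ROUGE, so a
    lower index means a better summary. Each better candidate should outscore
    each worse one by a margin that grows with their rank difference.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


def select_best(doc_vec: Sequence[float],
                cand_vecs: List[Sequence[float]]) -> int:
    """Inference step: return the index of the highest-scoring candidate."""
    scores = [cosine(doc_vec, c) for c in cand_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy 2-d "embeddings"; a real system would use encoder representations.
    doc = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [0.5, 0.5]]
    print("selected candidate:", select_best(doc, candidates))
    # Scores already in the desired order incur no loss.
    print("loss (well ordered):", pairwise_ranking_loss([0.9, 0.5, 0.1]))
```

At training time only the ordering of the candidates is needed, which is why the loss takes scores rather than references; at inference time no reference is available, and the scorer simply picks the candidate it rates highest for the given document.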