SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization
Yixin Liu, Pengfei Liu
TL;DR
SimCLS introduces a two-stage framework that separates generation from evaluation so that summary quality can be optimized directly with a contrastive, reference-free evaluator. The generation stage uses a Seq2Seq model to produce diverse candidate summaries; the evaluation stage trains a RoBERTa-based scorer with a ranking loss to score each candidate against the source document. On CNN/DailyMail (CNNDM) and XSum, SimCLS achieves state-of-the-art or near-state-of-the-art ROUGE and semantic scores, with more pronounced gains on CNNDM. Fine-grained analyses at the entity and sentence level, together with a study of positional bias mitigation, provide supplementary insights. The work highlights the potential to close the gap between training objectives and evaluation metrics in abstractive summarization and suggests that the two-stage, contrastive paradigm may apply more broadly.
Abstract
In this paper, we present a conceptually simple yet empirically powerful framework for abstractive summarization, SimCLS, which bridges the gap between the learning objective and the evaluation metrics that results from the currently dominant sequence-to-sequence learning framework. It does so by formulating text generation as a reference-free evaluation problem (i.e., quality estimation) assisted by contrastive learning. Experimental results show that, with minor modifications to existing top-scoring systems, SimCLS improves their performance by a large margin. In particular, it achieves an absolute improvement of 2.51 over BART and 2.50 over PEGASUS in ROUGE-1 on the CNN/DailyMail dataset, driving the state-of-the-art performance to a new level. We have open-sourced our code and results: https://github.com/yixinL7/SimCLS. Results of our proposed models have been deployed on the ExplainaBoard platform, which allows researchers to understand our systems in a more fine-grained way.
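To make the evaluation stage more concrete, the following is a minimal sketch of how a ranking loss over candidate summaries and the final candidate selection might look. It is an illustration under stated assumptions, not the released implementation: the function names `ranking_loss` and `rerank`, the rank-dependent margin, and the default margin value are all hypothetical, and the scores are assumed to come from some scorer that compares each candidate with the source document.

```python
from typing import Sequence


def ranking_loss(scores: Sequence[float], margin: float = 0.01) -> float:
    """Pairwise margin ranking loss over candidate scores.

    Assumption: `scores` is ordered so that index 0 belongs to the best
    candidate and the last index to the worst, where "best" is judged by a
    reference-based metric such as ROUGE that is only needed during training.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            # The better candidate i should outscore the worse candidate j
            # by a margin that grows with their rank distance (assumed form).
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss


def rerank(candidates: Sequence[str], scores: Sequence[float]) -> str:
    """Return the candidate with the highest scorer output.

    No reference summary is used here, which is what makes the
    evaluation reference-free at inference time.
    """
    best = max(range(len(candidates)), key=lambda k: scores[k])
    return candidates[best]


if __name__ == "__main__":
    # Scores that already respect the quality order incur no loss.
    print(ranking_loss([0.9, 0.5, 0.1]))
    # Scores in the wrong order are penalized.
    print(ranking_loss([0.1, 0.5, 0.9]))
    print(rerank(["cand A", "cand B", "cand C"], [0.2, 0.7, 0.1]))
```

In this sketch the loss is zero whenever the scorer already orders the candidates by quality with sufficient separation, so training only pushes on pairs that are mis-ranked or too close together.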
