BLooP: Zero-Shot Abstractive Summarization using Large Language Models with Bigram Lookahead Promotion

Varun Iyer; Cornelia Caragea

Abstract

Abstractive summarization requires models to generate summaries that convey information in the source document. While large language models (LLMs) can generate summaries without fine-tuning, they often miss key details and include extraneous information. We propose BLooP (Bigram Lookahead Promotion), a simple training-free decoding intervention that encourages LLMs to generate tokens that form bigrams from the source document. BLooP operates through a hash table lookup at each decoding step, requiring no training, fine-tuning, or model modification. We demonstrate improvements in ROUGE and BARTScore for Llama-3.1-8B-Instruct, Mistral-Nemo-Instruct-2407, and Gemma-2-9b-it on CNN/DM, CCSum, Multi-News, and SciTLDR. Human evaluation shows that BLooP significantly improves faithfulness without reducing readability. We make the code available at https://github.com/varuniyer/BLooP.
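To make the decoding intervention described in the abstract concrete, the following is a minimal sketch of one way a bigram hash table could be built from the source document and consulted at a decoding step. It is not the authors' implementation (see the linked repository for that). The function names `build_bigram_table` and `promote`, the additive bonus `alpha`, and the use of a plain list of per-token scores are all assumptions made for illustration; the paper's actual promotion rule may differ.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_bigram_table(source_tokens: List[int]) -> Dict[int, Set[int]]:
    """Map each source token id to the set of token ids that follow it.

    Hypothetical helper: the table is built once per source document.
    """
    table: Dict[int, Set[int]] = defaultdict(set)
    for prev, nxt in zip(source_tokens, source_tokens[1:]):
        table[prev].add(nxt)
    return dict(table)


def promote(
    scores: List[float],
    prev_token: int,
    table: Dict[int, Set[int]],
    alpha: float,
) -> List[float]:
    """Return a copy of `scores` with a bonus for source-bigram completions.

    `scores[v]` is the model's score for vocabulary id `v` at this step.
    Every `v` such that (prev_token, v) occurs in the source receives an
    additive bonus `alpha` (assumed form, not taken from the paper).
    """
    promoted = list(scores)
    # One hash lookup per decoding step: successors of the previous token.
    for v in table.get(prev_token, ()):
        promoted[v] += alpha
    return promoted


# Toy usage with a 6-token vocabulary and integer token ids.
source = [3, 1, 4, 1, 5]
table = build_bigram_table(source)
step_scores = promote([0.0] * 6, prev_token=1, table=table, alpha=2.0)
```

In this toy example, token 1 is followed by tokens 4 and 5 in the source, so only those two entries of `step_scores` are raised. The model itself is untouched, which matches the abstract's claim that no training or model modification is needed.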

Paper Structure (36 sections, 4 equations, 3 figures, 8 tables)

Introduction
Related Work
Methodology
Task Formulation
Proposed Approach: BLooP
Experimental Setup
Datasets
CNN/DM
Multi-News
CCSum
SciTLDR
Baselines and Evaluation Metrics
PEGASUS
TED
WikiTransfer
...and 21 more sections

Figures (3)

Figure 1: Increasing the beam width consistently improves Llama's performance on CNN/DM's validation split. In contrast, Gemma and Mistral stop improving once the beam width exceeds 4 and 5, respectively.
Figure 2: Part-of-speech tags of tokens in Llama-generated summaries of CCSum test set articles that differ because of BLooP.
Figure 3: Ablating $\alpha$ with Llama 3.1 8B Instruct on 10% of the CNN/DM validation set using a beam size of 8. fw-BLooP is frequency-weighted BLooP, where the promotion in Equation \ref{eq:promotion} is multiplied by the frequency of the bigram $(s_{t-1}, v)$ in the input document.
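
The frequency-weighted variant named in the Figure 3 caption can be sketched as follows. This is an illustrative assumption rather than the paper's code: the promotion equation itself is not reproduced in this excerpt, so the base promotion is taken here to be an additive bonus `alpha`, which is then multiplied by the number of times the bigram $(s_{t-1}, v)$ occurs in the input document. The names `count_bigrams` and `promote_weighted` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def count_bigrams(source_tokens: List[int]) -> Dict[int, Counter]:
    """Map each source token id to a Counter of the token ids following it."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for prev, nxt in zip(source_tokens, source_tokens[1:]):
        counts[prev][nxt] += 1
    return dict(counts)


def promote_weighted(
    scores: List[float],
    prev_token: int,
    counts: Dict[int, Counter],
    alpha: float,
) -> List[float]:
    """Add alpha * frequency(prev_token, v) to the score of each candidate v.

    The additive base promotion `alpha` is an assumption; only the
    multiplication by bigram frequency comes from the figure caption.
    """
    promoted = list(scores)
    for v, freq in counts.get(prev_token, {}).items():
        promoted[v] += alpha * freq
    return promoted


# Toy usage: the bigram (1, 2) occurs twice, (2, 1) and (2, 3) once each.
fw_counts = count_bigrams([1, 2, 1, 2, 3])
fw_scores = promote_weighted([0.0] * 4, prev_token=1, counts=fw_counts, alpha=0.5)
```

Under these assumptions, a bigram that appears twice in the source receives twice the bonus of one that appears once, so repeated phrases are favored more strongly than in the unweighted variant.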
