BLooP: Zero-Shot Abstractive Summarization using Large Language Models with Bigram Lookahead Promotion

Varun Iyer, Cornelia Caragea

Abstract

Abstractive summarization requires models to generate summaries that convey information in the source document. While large language models (LLMs) can generate summaries without fine-tuning, they often miss key details and include extraneous information. We propose BLooP (Bigram Lookahead Promotion), a simple training-free decoding intervention that encourages LLMs to generate tokens that form bigrams from the source document. BLooP operates through a hash table lookup at each decoding step, requiring no training, fine-tuning, or model modification. We demonstrate improvements in ROUGE and BARTScore for Llama-3.1-8B-Instruct, Mistral-Nemo-Instruct-2407, and Gemma-2-9b-it on CNN/DM, CCSum, Multi-News, and SciTLDR. Human evaluation shows that BLooP significantly improves faithfulness without reducing readability. We make the code available at https://github.com/varuniyer/BLooP.
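To make the decoding intervention concrete, here is a minimal Python sketch of a hash-table bigram lookup applied to next-token scores. It is not the authors' implementation: the function names (`build_bigram_table`, `promote`), the use of raw token ids, and the additive bonus `alpha` on the logits are assumptions for illustration, since the abstract does not give the exact form of the promotion.

```python
from collections import defaultdict


def build_bigram_table(source_ids):
    """Map each source token id to the set of token ids that follow it."""
    table = defaultdict(set)
    for prev, nxt in zip(source_ids, source_ids[1:]):
        table[prev].add(nxt)
    return table


def promote(logits, prev_token, table, alpha=1.0):
    """Return a copy of `logits` with a bonus for source-bigram continuations.

    Assumption: the promotion is a constant additive bonus `alpha`;
    the paper's actual formula may differ.
    """
    out = list(logits)
    # .get avoids inserting new keys into the defaultdict during lookup
    for v in table.get(prev_token, ()):
        out[v] += alpha
    return out


# Toy example with a vocabulary of 6 token ids.
source_ids = [3, 1, 4, 1, 5]
table = build_bigram_table(source_ids)
logits = [0.0] * 6
boosted = promote(logits, prev_token=1, table=table, alpha=2.0)
print(boosted)
```

In this toy input, token 1 is followed by tokens 4 and 5 in the source, so only those two entries receive the bonus. The table is built once per document, and each decoding step costs a single lookup keyed on the previously generated token.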

Paper Structure (36 sections, 4 equations, 3 figures, 8 tables)

Figures (3)

  • Figure 1: Increasing the beam width consistently improves Llama's performance on CNN/DM's validation split. In contrast, Gemma and Mistral stop improving once the beam width exceeds 4 and 5, respectively.
  • Figure 2: Part-of-speech tags of tokens in Llama-generated summaries of CCSum test set articles that differ because of BLooP.
  • Figure 3: Ablating $\alpha$ with Llama 3.1 8B Instruct on 10% of the CNN/DM validation set using a beam size of 8. fw-BLooP is frequency-weighted BLooP, where the promotion in Equation \ref{eq:promotion} is multiplied by the frequency of the bigram $(s_{t-1}, v)$ in the input document.
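
The frequency-weighted variant described in the Figure 3 caption can be sketched as follows. This is an illustrative sketch only: the names `build_bigram_counts` and `fw_promote` are hypothetical, the base promotion is assumed to be an additive bonus `alpha`, and "frequency" is assumed to mean the raw count of the bigram $(s_{t-1}, v)$ in the input document, none of which is specified in this listing.

```python
from collections import Counter, defaultdict


def build_bigram_counts(source_ids):
    """Map each source token id to a Counter of the token ids that follow it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(source_ids, source_ids[1:]):
        counts[prev][nxt] += 1
    return counts


def fw_promote(logits, prev_token, counts, alpha=1.0):
    """Scale the assumed additive bonus `alpha` by the bigram's raw count."""
    out = list(logits)
    for v, n in counts.get(prev_token, {}).items():
        out[v] += alpha * n
    return out


# Toy example: bigram (1, 2) occurs twice, bigram (1, 3) once.
source_ids = [1, 2, 1, 2, 1, 3]
counts = build_bigram_counts(source_ids)
fw_boosted = fw_promote([0.0] * 4, prev_token=1, counts=counts, alpha=0.5)
print(fw_boosted)
```

Under these assumptions, a continuation that appears more often after $s_{t-1}$ in the source receives a proportionally larger bonus, whereas the unweighted form treats every source bigram equally.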