G2: Guided Generation for Enhanced Output Diversity in LLMs

Zhiwen Ruan, Yixia Li, Yefeng Liu, Yun Chen, Weihua Luo, Peng Li, Yang Liu, Guanhua Chen

TL;DR

The paper tackles the limited output diversity of LLMs by introducing Guide-to-Generation (G2), a training-free decoding framework that uses a base generator plus two contrastive prompts—Diversity and Dedupe Guides—along with a Center Selection Strategy and entropy-based gating to balance novelty and quality. By conditioning on representative prior responses and selectively intervening only when uncertainty is high, G2 achieves improved diversity across creative generation, instruction-following, translation, and summarization without sacrificing fidelity. Extensive experiments demonstrate that G2 attains near Pareto-optimal diversity-quality trade-offs and maintains practical efficiency, making it suitable for real-world deployment. The work highlights the effectiveness of dual-contrastive guidance and representative context conditioning for robust, task-agnostic diversity enhancement in LLMs.

Abstract

Large Language Models (LLMs) have demonstrated exceptional performance across diverse natural language processing tasks. However, these models exhibit a critical limitation in output diversity, often generating highly similar content across multiple attempts. This limitation significantly affects tasks requiring diverse outputs, from creative writing to reasoning. Existing solutions, like temperature scaling, enhance diversity by modifying probability distributions but compromise output quality. We propose Guide-to-Generation (G2), a training-free plug-and-play method that enhances output diversity while preserving generation quality. G2 employs a base generator alongside dual Guides, which guide the generation process through decoding-based interventions to encourage more diverse outputs conditioned on the original query. Comprehensive experiments demonstrate that G2 effectively improves output diversity while maintaining an optimal balance between diversity and quality.
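The gated, dual-guide intervention described above can be illustrated with a small sketch. This is not the paper's algorithm: the combination rule (add the Diversity Guide's logits, subtract the Dedupe Guide's), the entropy threshold `tau`, and every function name here are assumptions made for illustration. Only the idea of intervening when the base generator is uncertain, and an intervention strength called `theta`, come from the text.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability tokens contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def guided_logits(base, diversity, dedupe, theta=1.0, tau=0.5):
    """Hypothetical gated contrastive adjustment for one decoding step.

    base, diversity, dedupe: next-token logits from the base generator and
    the two guide prompts (same vocabulary order).
    theta: intervention strength.
    tau: assumed entropy threshold; below it the base logits pass through.
    """
    if entropy(softmax(base)) <= tau:
        # The base generator is confident, so leave its distribution alone.
        return list(base)
    # Assumed rule: push toward the Diversity Guide, away from the Dedupe Guide.
    return [b + theta * (dv - dd) for b, dv, dd in zip(base, diversity, dedupe)]


# A peaked base distribution is left untouched; a flat one is adjusted.
confident = guided_logits([10.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0])
uncertain = guided_logits([0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0])
```

Gating on entropy keeps the adjustment away from positions where the base generator is already sure of the next token, which is one way to read the claim that diversity improves without a loss in quality.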

Paper Structure

This paper contains 51 sections, 5 equations, 14 figures, 7 tables.

Figures (14)

  • Figure 1: Comparison between standard decoding and our method G2. Standard decoding often produces repetitive outputs, with certain tokens dominating due to peaked softmax distributions. G2 leverages a Diversity Guide and a Dedupe Guide to encourage diverse and novel generations. See the main pseudocode algorithm in the Appendix for details.
  • Figure 2: Diversity-quality curves on WMT'14 and XLSum comparing G2 with other baselines under different settings.
  • Figure 3: Ablation results of intervention strategy, center selection, and guide components on NoveltyBench.
  • Figure 4: Comparison of Pass@N accuracy on GSM8K between our method and the baseline.
  • Figure 5: Impact of the intervention strength hyperparameter $\theta$ on both quality and diversity on 100 samples from the WMT'14 Fr-En validation set.
  • ...and 9 more figures