Truth or Deceit? A Bayesian Decoding Game Enhances Consistency and Reliability

Weitong Zhang, Chengqi Zang, Bernhard Kainz

TL;DR

This work proposes a novel game-theoretic approach that models the decoding process as a multistage Bayesian decoding game, enhancing the consistency and reliability of LLM output generation at the decoding stage.

Abstract

Large Language Models (LLMs) often produce outputs that, though plausible, can lack consistency and reliability, particularly in ambiguous or complex scenarios. The challenge lies in ensuring that outputs align with both factual correctness and human intent, and existing approaches tend to trade improved consistency for lower accuracy. To mitigate these challenges, we propose a novel game-theoretic approach to enhance consistency and reliability during the decoding stage of LLM output generation. Our method models the decoding process as a multistage Bayesian decoding game. This ensures consistency through Correctness Alignment and enhances reliability via Ambiguity Calibration. The model dynamically converges to a consensus on the most reliable outputs and distinguishes {Valid, Specious} outputs without human feedback or additional training. Our game design allows smaller models to outperform much larger models through game mechanisms (e.g., 78.1 LLaMA13B vs 76.6 PaLM540B) and supports the integration of various LLM strategies and models, demonstrating the potential of game-theoretic tools to improve the truthfulness and reliability of LLMs.
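
To make the idea of a generator and a discriminator "converging to a consensus" more concrete, the toy sketch below may help. It is not the paper's algorithm: the actual payoffs, update rules, and the Correctness Alignment and Ambiguity Calibration stages are not specified in this summary. The function names (`toy_consensus`, `normalize`, `entropy`), the log-linear mixing update, and the parameters `lr`, `tol`, and `max_rounds` are all assumptions chosen purely for illustration. Each player holds a strictly positive distribution over candidate answers and repeatedly moves part of the way toward the other player's distribution until the two agree.

```python
import math


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def entropy(dist):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def toy_consensus(gen, disc, lr=0.3, tol=1e-6, max_rounds=1000):
    """Illustrative consensus loop (an assumption, not the paper's method).

    gen, disc: strictly positive scores over the same candidate answers.
    Each round, both players take a log-linear step of size `lr` toward
    the other's current distribution; the loop stops once the largest
    per-candidate disagreement drops below `tol`.
    """
    gen, disc = normalize(gen), normalize(disc)
    rounds = 0
    while rounds < max_rounds:
        gap = max(abs(g - d) for g, d in zip(gen, disc))
        if gap < tol:
            break
        new_gen = normalize([g ** (1 - lr) * d ** lr for g, d in zip(gen, disc)])
        new_disc = normalize([d ** (1 - lr) * g ** lr for g, d in zip(gen, disc)])
        gen, disc = new_gen, new_disc
        rounds += 1
    return gen, disc, rounds


if __name__ == "__main__":
    candidates = ["A", "B", "C"]  # hypothetical candidate answers
    gen, disc, rounds = toy_consensus([0.5, 0.3, 0.2], [0.2, 0.6, 0.2])
    best = candidates[gen.index(max(gen))]
    print(f"rounds={rounds}, consensus pick={best}, entropy={entropy(gen):.3f}")
```

In this toy setting the two distributions meet at the normalized geometric mean of their starting points, so the consensus favors candidates that both players rate highly. Tracking `entropy` over rounds is one simple way to observe the kind of convergence behavior that the paper's figures report, under the assumptions stated above.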

Paper Structure (24 sections, 4 theorems, 20 equations, 4 figures, 5 tables)

Key Result

Theorem 1.

More than one (mixed) strategy Perfect Bayesian Equilibrium exists for this game.

Figures (4)

  • Figure 1: Distinguishing Valid from Specious LM outputs, particularly when human evaluation may overlook plausible errors. The three panels demonstrate how models can generate both Valid (accurate and reliable) and Specious (plausible but misleading) responses.
  • Figure 2: A Bayesian Decoding Game ensures consistency through Correctness Alignment and enhances reliability via Ambiguity Calibration. The generation and verification are structured as a multi-stage signaling game, fostering a coherent consensus on the outputs {Correct, Incorrect} while improving the reliability {Valid, Specious}.
  • Figure 3: BDG's game design quickly reaches equilibrium and consensus between the generator and discriminator, typically within 100 epochs. In contrast, ECG requires significantly more epochs (3000 in this case) and exhibits continuous fluctuations (as shown in the lower right) before achieving consensus. (Zoom in for details.)
  • Figure 4: Entropy dynamics during convergence. (a.1) Fluctuations in BDG indicate exploration of multiple equilibria. (a.2) ECG shows persistent entropy fluctuations and continued exploration without stabilizing. BDG improves LLM consistency and reliability for humans. (b.1) Impact of BDG and ECG on time and accuracy for human experts vs. non-experts. (b.2) Per-case time and accuracy with BDG and ECG for human experts vs. non-experts. (Zoom in for details.)

Theorems & Definitions (13)

  • Definition 1.
  • Definition 2.
  • Theorem 1.
  • Definition 3.
  • Theorem 2.
  • Theorem 3.
  • Definition 4.
  • Definition 5.
  • Theorem 4.
  • Definition 6.
  • ...and 3 more