Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

Miranda Muqing Miao, Michael Kearns

TL;DR

This work empirically investigates the Kalai–Vempala theory linking monofact rate, miscalibration, and hallucination in language models by using controlled n-gram experiments and synthetic SFT data. It demonstrates that data drawn from heavy-tailed Pareto distributions lowers monofact rates and reduces hallucination, and that deliberate miscalibration via selective upweighting can further reduce hallucination without sacrificing overall accuracy. An empirical KL divergence bound is proposed as a practical analogue to the population miscalibration term, enabling real-data guidance without access to the true distribution. The findings challenge the notion that deduplication is universally beneficial and highlight data-centric levers for reducing factual errors in LLM outputs, while acknowledging limitations in generalization and fairness.

Abstract

Hallucinated facts in large language models (LLMs) have recently been shown to obey a statistical lower bound determined by the monofact rate (related to the classical Good-Turing missing mass estimator) minus model miscalibration (Kalai & Vempala, 2024). We present the first empirical investigation of this three-way relationship in classical n-gram models and fine-tuned encoder-decoder Transformers. By generating training data from Pareto distributions with varying shape parameters, we systematically control the monofact rate and establish its positive relationship with hallucination. To bridge theory and practice, we derive an empirical analog of the hallucination bound by replacing the population miscalibration term (Section 2.1) with an empirical bin-wise KL divergence and confirm its practical viability. We then introduce selective upweighting -- a simple yet effective technique that strategically repeats as little as 5% of training examples -- to deliberately inject miscalibration into the model. This intervention reduces hallucination by up to 40%, challenging universal deduplication policies. Our experiments reveal a critical trade-off: selective upweighting maintains pre-injection levels of accuracy while substantially reducing hallucination, whereas standard training gradually improves accuracy but fails to address persistently high hallucination, indicating an inherent tension in optimization objectives.
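As a rough illustration of the monofact rate mentioned in the abstract, the sketch below computes the Good-Turing-style quantity (the fraction of training observations whose fact appears exactly once) on data sampled from a Pareto-weighted pool of facts. The sampling scheme, the function names `monofact_rate` and `sample_facts`, and the pool size of 20,000 are assumptions made for illustration; only the sample size of 5,000 statements comes from the figure captions, and this is not the authors' pipeline.

```python
import numpy as np


def monofact_rate(samples):
    """Fraction of observations whose fact occurs exactly once in `samples`.

    This is the Good-Turing style estimate N1 / N, where N1 is the number
    of facts seen exactly once and N is the total number of observations.
    """
    samples = np.asarray(samples)
    _, counts = np.unique(samples, return_counts=True)
    return float(np.sum(counts == 1)) / len(samples)


def sample_facts(shape, n_facts=20_000, n_samples=5_000, seed=0):
    """Draw fact ids with Pareto-distributed popularity (hypothetical scheme).

    Each fact in a pool of `n_facts` gets a weight drawn from a Pareto
    distribution with the given shape parameter; statements are then
    sampled with replacement in proportion to those weights.
    """
    rng = np.random.default_rng(seed)
    weights = rng.pareto(shape, size=n_facts) + 1.0  # shift so weights >= 1
    probs = weights / weights.sum()
    return rng.choice(n_facts, size=n_samples, p=probs)


if __name__ == "__main__":
    # Sweep the shape parameter and report the resulting monofact rate.
    for shape in (1.0, 1.5, 2.0, 3.0):
        rate = monofact_rate(sample_facts(shape))
        print(f"shape={shape:.1f}  monofact rate={rate:.3f}")
```

Sweeping `shape` in this way gives one handle on how repetition in the training sample changes the monofact rate; the exact numbers depend on the assumed pool size and seed.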

Paper Structure

This paper contains 32 sections, 3 theorems, 16 equations, 15 figures, 2 algorithms.

Key Result

Theorem 1

Figures (15)

  • Figure 1: Illustration of fact repetition frequencies across varying parameters ($\gamma$). Plot shows how often a fact appears (x-axis) versus how many facts have that appearance count (y-axis, logarithmic scale).
  • Figure 2: Each dot represents a sample of 5,000 statements. Left: Results show a positive relationship between monofact rate and hallucination. Middle Left: Heavy-tailed document distributions yield lower monofact rates. Middle Right: Miscalibration increases with monofact rate, suggesting better learning and calibration in low-monofact distributions. Right: Strong positive correlation between empirical KL divergence and miscalibration metrics.
  • Figure 3: Top: Average miscalibration per probability bin (binning created by a logarithmic binning strategy with $\epsilon$ = 0.1) for three monofact percent ranges. Bottom: Average empirical KL divergence per probability bin for three monofact percent ranges.
  • Figure 4: Top: Relationship between miscalibration (blue line, left y-axis) and hallucination rates (red line, right y-axis) for select fixed monofact rates. Dotted lines indicate metrics prior to any interventions. Each subplot shows how miscalibration and hallucination evolve as we duplicate token occurrences for an increasing number of statements from the training data (size of 5,000). Bottom: Relationship between empirical KL divergence (green line, left y-axis) and hallucination rates (red line, right y-axis) for select fixed monofact rates.
  • Figure 5: The data creation pipeline for the synthetic biography data that we utilize for supervised fine-tuning. Hallucination and inaccuracy are measured respectively through free-generation attribute hallucination and forced-generation attribute inaccuracy for all observed biographies.
  • ...and 10 more figures
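The duplication intervention referred to in the abstract and in the Figure 4 caption (repeating a small share of training statements) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `selectively_upweight`, the uniform random choice of which statements to repeat, and the `copies` parameter are all hypothetical, since this summary does not specify how the repeated examples are selected.

```python
import random


def selectively_upweight(statements, fraction=0.05, copies=1, seed=0):
    """Return the training set with a random subset of statements repeated.

    `fraction` of the statements are chosen (uniformly at random, which is
    an assumption of this sketch) and each chosen statement is appended
    `copies` additional times. The original statements are left in place.
    """
    rng = random.Random(seed)
    k = int(len(statements) * fraction)
    chosen = rng.sample(range(len(statements)), k)
    extra = [statements[i] for i in chosen for _ in range(copies)]
    return list(statements) + extra


if __name__ == "__main__":
    data = [f"statement_{i}" for i in range(5_000)]
    upweighted = selectively_upweight(data, fraction=0.05)
    print(len(data), "->", len(upweighted))
```

Because the duplicated statements receive more probability mass than the sampling distribution would give them, a model fit on the upweighted set is pushed away from calibration, which is the effect the intervention is meant to exploit.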

Theorems & Definitions (4)

  • Theorem 1: Kalai and Vempala's Hallucination Lower Bound
  • Theorem 2: Empirical KL-Divergence Hallucination Bound
  • Theorem 3: Empirical KL-Divergence Hallucination Bound
  • proof