Selective Risk Certification for LLM Outputs via Information-Lift Statistics: PAC-Bayes, Robustness, and Skeleton Design

Sanjeda Akter, Ibne Farabi Shihab, Anuj Sharma

TL;DR

The paper introduces information-lift certificates for LLM outputs, leveraging heavy-tail aware sub-gamma PAC-Bayes bounds to provide anytime-valid, sequence-level uncertainty certification that compares model outputs to a skeleton baseline. By defining a clipped information lift and aggregating it across autoregressive sequences, the authors enable formal abstention decisions with finite-sample guarantees and a robust skeleton design via Variational Skeleton Design (VSD). Empirical results across eight diverse datasets show substantial gains in coverage at fixed risk (77.0% at 2% risk) and exceptional blocking of critical errors (96% on a biomedical challenge set) compared with entropy-based baselines, while maintaining practical runtime overhead and adapting to top-k API constraints. The approach emphasizes robustness to distributional shifts and skeleton misspecification, though it remains frequency-based rather than severity-aware, highlighting avenues for future work in harm-aware and severity-weighted certification.

Abstract

Large language models often produce confident but incorrect outputs, creating a critical need for reliable uncertainty quantification with formal abstention guarantees. We introduce information-lift certificates that compare model probabilities to a skeleton baseline, accumulating evidence through sub-gamma PAC-Bayes bounds that remain valid under heavy-tailed distributions where standard concentration inequalities fail. On eight diverse datasets, our method achieves 77.0% coverage at 2% risk, outperforming recent baselines by 10.0 percentage points on average. In high-stakes scenarios, we block 96% of critical errors compared to 18-31% for entropy-based methods. While our frequency-based certification does not guarantee severity-weighted safety and depends on skeleton quality, performance degrades gracefully under distributional shifts, making the approach practical for real-world deployment.
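
To make the gating idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the per-token lift is the log-ratio of the model's probability to the skeleton's probability, clipped to a symmetric range and averaged over the sequence; the summary above does not give the exact definition, so the clipping bound `clip`, the use of a mean, the threshold `tau`, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def clipped_lift(logp_model: float, logp_skeleton: float, clip: float = 5.0) -> float:
    """Assumed per-token lift: log p_model - log p_skeleton, clipped to [-clip, clip]."""
    lift = logp_model - logp_skeleton
    return max(-clip, min(clip, lift))


def sequence_lift(model_logps: Sequence[float],
                  skeleton_logps: Sequence[float],
                  clip: float = 5.0) -> float:
    """Assumed sequence-level aggregate: mean of clipped per-token lifts."""
    if not model_logps or len(model_logps) != len(skeleton_logps):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(clipped_lift(m, s, clip) for m, s in zip(model_logps, skeleton_logps))
    return total / len(model_logps)


def gate(model_logps: Sequence[float],
         skeleton_logps: Sequence[float],
         tau: float,
         clip: float = 5.0) -> bool:
    """Return True to answer, False to abstain (hypothetical threshold rule)."""
    return sequence_lift(model_logps, skeleton_logps, clip) >= tau


def coverage_and_risk(decisions: Sequence[bool],
                      correct: Sequence[bool]) -> Tuple[float, float]:
    """Empirical coverage (fraction answered) and selective risk (error rate among answered)."""
    answered = [c for d, c in zip(decisions, correct) if d]
    coverage = len(answered) / len(decisions)
    risk = sum(1 for c in answered if not c) / len(answered) if answered else 0.0
    return coverage, risk
```

Sweeping `tau` over a calibration set and recording `coverage_and_risk` at each value traces an empirical risk-coverage curve; the certificates described in the abstract would replace the raw empirical risk with a high-probability upper bound.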

Paper Structure

This paper contains 75 sections, 5 theorems, 29 equations, 14 figures, 23 tables, 3 algorithms.

Key Result

Theorem 2.5

Let $\pi$ be a prior distribution over skeletons chosen before seeing calibration data, and $\rho$ be a posterior distribution over skeletons that may depend on calibration data. Under the sub-gamma assumption, with probability at least $1-\delta$ over the calibration sample, In particular, for any fixed skeleton $S$ (taking $\rho=\delta_S$),

Figures (14)

  • Figure 1: Empirical coverage of 95% confidence bounds: standard Bernstein bound (dashed) fails to maintain valid coverage (drops below 90% in some regions) due to tail violations, while our sub-gamma bound maintains valid 95% coverage.
  • Figure 2: Risk-coverage performance comparison on NQ-Open dataset using GPT-4 with sequence-level gating. Target risk: 2%. Arrows indicate $\tau$-sweep direction (from high coverage/low risk to low coverage/high risk). Dotted lines show fixed target risks. Shaded regions show 95% confidence intervals.
  • Figure 3: Sub-gamma assumption validation and robustness analysis across all datasets and models. QQ-plots confirm sub-gamma fit quality with $R^2 > 0.85$ for 85% of dataset-model combinations. Hill indices range from 1.5 to 2.3, indicating power-law tails. See Figure \ref{fig:complete_assumption_audits} in the appendix for detailed multi-panel analysis.
  • Figure 4: Ablation studies showing parameter sensitivity analysis across different components of our method.
  • Figure 5: Main experimental results.
  • ...and 9 more figures

Theorems & Definitions (10)

  • Definition 2.1: Information lift statistic
  • Definition 2.2: Skeleton distribution
  • Remark 2.3: Sequence-level aggregation
  • Theorem 2.5: PAC-Bayes sub-gamma certificate
  • Theorem 2.6: $\eta$-robustness
  • Theorem 2.7: $\kappa$-informativeness lower bound
  • Proposition 2.8: Finite-sample selective risk
  • Theorem 2.9: Anytime-valid sequential bound
  • Remark C.1: On constants
  • Remark C.2