Variational Uncertainty Decomposition for In-Context Learning

I. Shavindra Jayasekera, Jacob Si, Filippo Valdettaro, Wenlong Chen, A. Aldo Faisal, Yingzhen Li

TL;DR

The paper tackles uncertainty in in-context learning (ICL) by treating ICL as an implicit Bayesian process and introducing a Variational Uncertainty Decomposition (VUD) framework. VUD constructs variational upper bounds on aleatoric uncertainty using optimisable auxiliary queries Z; these bounds in turn yield a lower bound on epistemic uncertainty, without requiring explicit posterior samples of the latent parameter θ. The method uses permutation-based ensembling and KL-filtering to promote exchangeability, and it produces consistent, interpretable decompositions across synthetic benchmarks and real-world tasks, including bandit settings and QA abstention. Results show that epistemic uncertainty guides exploration and that aleatoric uncertainty is informative for abstention and OOD detection, highlighting the practical value of uncertainty decomposition in LLM-based ICL.
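To make the idea of permutation-based ensembling concrete, here is a minimal sketch, not the paper's implementation. It assumes that "ensembling" means averaging the predictive distribution over reorderings of the in-context examples. The predictor `toy_predict`, the helper `permutation_ensemble`, and the parameters `max_perms` and `seed` are hypothetical names introduced for illustration only. KL-filtering is not sketched because this summary does not describe it.

```python
import itertools
import numpy as np


def toy_predict(context, query):
    """Hypothetical order-sensitive predictor standing in for an LLM.

    Later in-context examples receive larger weight, so the output
    changes when the context is reordered. `query` is unused here.
    """
    weights = np.arange(1, len(context) + 1, dtype=float)
    labels = np.array([y for _, y in context], dtype=float)
    p1 = float(weights @ labels / weights.sum())
    return np.array([1.0 - p1, p1])


def permutation_ensemble(predict, context, query, max_perms=None, seed=0):
    """Average predictions over orderings of the in-context examples.

    With max_perms=None every permutation is enumerated (only feasible
    for small contexts); otherwise max_perms random orderings are drawn.
    """
    n = len(context)
    if max_perms is None:
        perms = list(itertools.permutations(range(n)))
    else:
        rng = np.random.default_rng(seed)
        perms = [rng.permutation(n) for _ in range(max_perms)]
    preds = [predict([context[i] for i in perm], query) for perm in perms]
    return np.mean(preds, axis=0)


# Toy (x, y) pairs with binary labels; the values are illustrative only.
context = [(0.1, 0), (0.4, 1), (0.9, 1)]
query = 0.5

print("single order:  ", toy_predict(context, query))
print("reversed order:", toy_predict(context[::-1], query))
print("ensembled:     ", permutation_ensemble(toy_predict, context, query))
```

Averaging over all orderings makes the ensembled prediction invariant to how the context was originally ordered, which is the exchangeability property a Bayesian predictive distribution would have. With a random subset of orderings the invariance holds only approximately.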

Abstract

As large language models (LLMs) gain popularity in conducting prediction tasks in-context, understanding the sources of uncertainty in in-context learning becomes essential to ensuring reliability. The recent hypothesis of in-context learning performing predictive Bayesian inference opens the avenue for Bayesian uncertainty estimation, particularly for decomposing uncertainty into epistemic uncertainty due to lack of in-context data and aleatoric uncertainty inherent in the in-context prediction task. However, the decomposition idea remains under-explored due to the intractability of the latent parameter posterior from the underlying Bayesian model. In this work, we introduce a variational uncertainty decomposition framework for in-context learning without explicitly sampling from the latent parameter posterior, by optimising auxiliary queries as probes to obtain an upper bound to the aleatoric uncertainty of an LLM's in-context learning procedure, which also induces a lower bound to the epistemic uncertainty. Through experiments on synthetic and real-world tasks, we show quantitatively and qualitatively that the decomposed uncertainties obtained from our method exhibit desirable properties of epistemic and aleatoric uncertainty.
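The decomposition described above can be illustrated on a toy Bayesian model where the posterior is tractable. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses a discrete latent θ with known posterior weights `w`, and the functions `decompose` and `variational_aleatoric` as well as the probe tables are hypothetical. It computes total uncertainty as the entropy of the predictive distribution, aleatoric uncertainty as the expected entropy given θ, and then the entropy of the prediction after conditioning on the response `U` to an auxiliary query, which never touches θ samples directly.

```python
import numpy as np


def entropy(p):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))


def decompose(w, p_y_given_theta):
    """Exact decomposition when the posterior over theta is known."""
    total = entropy(w @ p_y_given_theta)
    aleatoric = sum(wi * entropy(row) for wi, row in zip(w, p_y_given_theta))
    return total, aleatoric, total - aleatoric


def variational_aleatoric(w, p_y_given_theta, p_u_given_theta):
    """Expected entropy of y* after observing the auxiliary response U.

    Assumes y* and U are conditionally independent given theta.
    """
    p_u = w @ p_u_given_theta
    value = 0.0
    for u, pu in enumerate(p_u):
        if pu == 0:
            continue
        posterior = w * p_u_given_theta[:, u] / pu  # p(theta | U = u)
        value += pu * entropy(posterior @ p_y_given_theta)
    return value


# Illustrative numbers only: three latent values, binary label.
w = np.array([0.5, 0.3, 0.2])
p_y = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])

# Candidate auxiliary queries, each a table p(U | theta).
probes = {
    "uninformative": np.array([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]),
    "informative": np.array([[0.95, 0.05], [0.5, 0.5], [0.05, 0.95]]),
    "identifies_theta": np.eye(3),
}

total, aleatoric, epistemic = decompose(w, p_y)
print(f"total={total:.4f} aleatoric={aleatoric:.4f} epistemic={epistemic:.4f}")
for name, probe in probes.items():
    v_a = variational_aleatoric(w, p_y, probe)
    print(f"{name}: aleatoric bound={v_a:.4f} epistemic bound={total - v_a:.4f}")
```

In this toy setting every probe gives a value between the true aleatoric uncertainty and the total uncertainty, so choosing the probe with the smallest value tightens the aleatoric upper bound and the induced epistemic lower bound. A probe that reveals θ exactly recovers the true decomposition, while an uninformative probe attributes all uncertainty to the aleatoric part.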

Paper Structure

This paper contains 45 sections, 8 theorems, 61 equations, 67 figures, 12 tables, 5 algorithms.

Key Result

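Theorem 3.1 (Aleatoric Uncertainty Upper-Bound)

If the conditional independence relations in $\mathcal{G}$ hold, then the variational estimator $V_a$ provides an upper bound to the aleatoric uncertainty $U_a$, that is, $U_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D}) \le V_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D})$. The theorem also states the gap between $U_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D})$ and $V_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D})$; that expression is not reproduced in this summary.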
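One way to see why a bound of this kind can hold is the following sketch in simplified notation. It is an assumption-based illustration, not the paper's statement or proof. Here $\mathbf{U}$ denotes the response to an auxiliary query $Z$, the aleatoric uncertainty is taken to be the expected entropy of $\mathbf{y}^*$ given $\theta$, the variational estimator is taken to be the expected entropy of $\mathbf{y}^*$ given $\mathbf{U}$, and $\mathbf{y}^*$ is assumed conditionally independent of $\mathbf{U}$ given $\theta$. The gap stated in the paper may be written differently.

```latex
\begin{align*}
U_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D})
  &= H[\mathbf{y}^* | \theta, \bm{x}^*, \mathcal{D}] \\
  &= H[\mathbf{y}^* | \theta, \mathbf{U}, \bm{x}^*, Z, \mathcal{D}]
     && \text{(conditional independence of } \mathbf{y}^* \text{ and } \mathbf{U} \text{ given } \theta\text{)} \\
  &\le H[\mathbf{y}^* | \mathbf{U}, \bm{x}^*, Z, \mathcal{D}]
     && \text{(conditioning cannot increase entropy)} \\
  &= V_a(\mathbf{y}^* | \bm{x}^*, \mathcal{D}), \\[4pt]
V_a - U_a
  &= I(\mathbf{y}^* ; \theta \,|\, \mathbf{U}, \bm{x}^*, Z, \mathcal{D}) \;\ge\; 0 .
\end{align*}
```

Under these assumptions the gap is a conditional mutual information, so it shrinks as the auxiliary response $\mathbf{U}$ becomes more informative about $\theta$, which is consistent with optimising the auxiliary queries to tighten the bound.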

Figures (67)

  • Figure 1: Uncertainty Decomposition with Auxiliary Data (Above). Decomposition Example for Two-Moons Dataset (Below).
  • Figure 2: The DAG $\mathcal{G}$ of the conditional independence assumptions.
  • Figure 3: Variational Uncertainty Decomposition (VUD) Framework.
  • Figure 4: Uncertainty Decompositions for Logistic and Linear Regressions.
  • Figure 5: Uncertainty Decompositions for Regression Tasks with Gaps in ICL Data.
  • ...and 62 more figures

Theorems & Definitions (13)

  • theorem 3.1: Aleatoric Uncertainty Upper-Bound
  • theorem 3.2: Aleatoric Variance Upper-Bound
  • theorem A.1: Aleatoric Uncertainty Upper-Bound
  • proof
  • proof: Alternative Proof
  • lemma A.1
  • proof
  • theorem A.1: Aleatoric Variance Upper-Bound
  • proof
  • theorem D.1: de Finetti's representation theorem
  • ...and 3 more