Variational Uncertainty Decomposition for In-Context Learning
I. Shavindra Jayasekera, Jacob Si, Filippo Valdettaro, Wenlong Chen, A. Aldo Faisal, Yingzhen Li
TL;DR
The paper tackles uncertainty in in-context learning (ICL) by treating ICL as an implicit Bayesian process and introducing a Variational Uncertainty Decomposition (VUD) framework. VUD constructs a variational upper bound on aleatoric uncertainty using optimisable auxiliary queries Z, which in turn yields a lower bound on epistemic uncertainty (the total predictive uncertainty minus the aleatoric bound), without requiring explicit posterior samples of the latent parameter θ. The method uses permutation-based ensembling and KL-filtering to promote exchangeability, and it produces consistent, interpretable decompositions on synthetic benchmarks and real-world tasks, including bandit settings and QA abstention. The results show that epistemic uncertainty guides exploration and that aleatoric uncertainty is informative for abstention and out-of-distribution (OOD) detection, highlighting the practical value of uncertainty decomposition in LLM-based ICL.
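One way to read the bound described above is sketched below. The notation is an assumption made for illustration and is not taken from the summary: D is the in-context dataset, (x*, y*) the test query and its label, and U the model's own predicted responses to the auxiliary queries Z. The inequality relies on y* and U being conditionally independent given θ, which is where exchangeability enters.

```latex
\begin{align}
% Standard Bayesian decomposition of total predictive uncertainty
H\big[p(y^* \mid x^*, D)\big]
  &= \underbrace{\mathbb{E}_{p(\theta \mid D)}\, H\big[p(y^* \mid x^*, \theta)\big]}_{\text{aleatoric}}
   + \underbrace{I(y^*; \theta \mid x^*, D)}_{\text{epistemic}} \\
% Conditioning on U is less informative than conditioning on \theta,
% so the conditional entropy given U upper-bounds the aleatoric term
\mathbb{E}_{p(\theta \mid D)}\, H\big[p(y^* \mid x^*, \theta)\big]
  &\le \mathbb{E}_{p(U \mid Z, D)}\, H\big[p(y^* \mid x^*, U, Z, D)\big] \\
% Subtracting the upper bound from the total gives a lower bound on the epistemic term
I(y^*; U \mid x^*, Z, D)
  &\le I(y^*; \theta \mid x^*, D)
\end{align}
```

Under this reading, optimising Z amounts to minimising the right-hand side of the second line, which tightens both bounds at once.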
Abstract
As large language models (LLMs) are increasingly used for in-context prediction tasks, understanding the sources of uncertainty in in-context learning becomes essential to ensuring reliability. The recent hypothesis that in-context learning performs predictive Bayesian inference opens an avenue for Bayesian uncertainty estimation, particularly for decomposing uncertainty into epistemic uncertainty, due to a lack of in-context data, and aleatoric uncertainty, inherent in the in-context prediction task. However, this decomposition remains under-explored because the latent parameter posterior of the underlying Bayesian model is intractable. In this work, we introduce a variational uncertainty decomposition framework for in-context learning that does not explicitly sample from the latent parameter posterior. Instead, it optimises auxiliary queries as probes to obtain an upper bound on the aleatoric uncertainty of an LLM's in-context learning procedure, which also induces a lower bound on the epistemic uncertainty. Through experiments on synthetic and real-world tasks, we show quantitatively and qualitatively that the decomposed uncertainties obtained from our method exhibit desirable properties of epistemic and aleatoric uncertainty.
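The bounding idea can be checked numerically in a toy setting where the posterior is tractable. The sketch below is not the paper's method or code: it assumes a conjugate Beta-Bernoulli model in place of an LLM, has no input covariates, and treats the "auxiliary queries" simply as m extra simulated draws whose success count plays the role of the probe responses. All function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def total_uncertainty(a, b):
    """Entropy of the posterior predictive under a Beta(a, b) posterior."""
    return binary_entropy(a / (a + b))


def aleatoric_exact(a, b, grid=20000):
    """Expected entropy E_theta[H(y | theta)] via midpoint integration.

    Only possible here because the toy posterior is tractable.
    """
    acc = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        log_pdf = (a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta(a, b)
        acc += math.exp(log_pdf) * binary_entropy(t)
    return acc / grid


def aleatoric_upper_bound(a, b, m):
    """Expected predictive entropy after conditioning on m simulated draws.

    The success count s follows a Beta-Binomial distribution under the
    current posterior; no samples of theta are needed.
    """
    bound = 0.0
    for s in range(m + 1):
        log_pmf = (math.log(math.comb(m, s))
                   + log_beta(a + s, b + m - s) - log_beta(a, b))
        bound += math.exp(log_pmf) * binary_entropy((a + s) / (a + b + m))
    return bound


if __name__ == "__main__":
    # Assumed example: Beta(1, 1) prior, 3 successes and 1 failure in context.
    a, b = 4, 2
    total = total_uncertainty(a, b)
    print("total uncertainty      :", total)
    print("exact aleatoric        :", aleatoric_exact(a, b))
    for m in (1, 10, 50):
        upper = aleatoric_upper_bound(a, b, m)
        print(f"m={m:2d} aleatoric upper  :", upper,
              " epistemic lower:", total - upper)
```

In this toy model the upper bound sits between the exact aleatoric term and the total uncertainty, and it tightens as m grows. With an LLM the exact aleatoric term is unavailable, which is why only the bound is computed there.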
