Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach
Linyu Liu, Yu Pan, Xiaocheng Li, Guanting Chen
TL;DR
The paper formalizes uncertainty estimation for LLMs as a supervised regression problem using labeled response-quality metrics. It proposes a pipeline that extracts features from white-box hidden activations and grey-box probability/entropy signals, enabling three regimes: white-box, grey-box, and black-box uncertainty estimation. Empirical results show the supervised approach generally outperforms unsupervised baselines across QA, translation, and MMLU tasks, with hidden activations providing notably strong signals and transferability to out-of-distribution settings. The work also clarifies the distinction between uncertainty estimation and calibration and demonstrates practical applicability to closed-source LLMs via black-box estimation.
Abstract
In this paper, we study the problem of uncertainty estimation and calibration for LLMs. We begin by formulating the uncertainty estimation problem, a relevant yet underexplored area in the existing literature. We then propose a supervised approach that leverages labeled datasets to estimate the uncertainty in LLMs' responses. Based on the formulation, we illustrate the difference between uncertainty estimation for LLMs and for standard ML models, and explain why the hidden neurons of LLMs may contain uncertainty information. Our approach demonstrates the benefits of utilizing hidden activations to enhance uncertainty estimation across various tasks and shows robust transferability in out-of-distribution settings. We distinguish the uncertainty estimation task from the uncertainty calibration task and show that better uncertainty estimation leads to better calibration performance. Furthermore, our method is easy to implement and adaptable to different levels of model accessibility, including black box, grey box, and white box.
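
The supervised formulation described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single grey-box feature (mean token entropy) and a one-dimensional least-squares fit in place of the richer feature set, including hidden activations, and the learner that the paper would use. The data are made up, and the names peaked, fit_line, and estimate_confidence are hypothetical.

```python
import math

def mean_token_entropy(token_dists):
    """Average Shannon entropy (nats) over the per-token output distributions."""
    entropies = [-sum(p * math.log(p) for p in dist if p > 0.0) for dist in token_dists]
    return sum(entropies) / len(entropies)

def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def auroc(scores, labels):
    """Probability that a correct response outscores an incorrect one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def peaked(c):
    """Hypothetical helper: a 3-way distribution whose top probability is c."""
    return [c, (1 - c) / 2, (1 - c) / 2]

# Toy labeled data (made up): each response is a list of next-token
# distributions paired with a 0/1 quality label, e.g. exact-match correctness.
train = [([peaked(0.95)] * 4, 1), ([peaked(0.90)] * 4, 1), ([peaked(0.80)] * 4, 1),
         ([peaked(0.60)] * 4, 0), ([peaked(0.50)] * 4, 1), ([peaked(0.40)] * 4, 0)]

# Supervised step: regress the quality label on the extracted feature.
xs = [mean_token_entropy(dists) for dists, _ in train]
ys = [label for _, label in train]
slope, intercept = fit_line(xs, ys)

def estimate_confidence(token_dists):
    """Predicted quality score for a new response, clipped to [0, 1]."""
    raw = slope * mean_token_entropy(token_dists) + intercept
    return min(1.0, max(0.0, raw))

# Held-out toy responses, scored by the fitted estimator.
test = [([peaked(0.92)] * 3, 1), ([peaked(0.85)] * 3, 1),
        ([peaked(0.55)] * 3, 0), ([peaked(0.45)] * 3, 0)]
test_scores = [estimate_confidence(dists) for dists, _ in test]
test_labels = [label for _, label in test]
test_auroc = auroc(test_scores, test_labels)
```

Here test_auroc measures how well the estimated score ranks correct responses above incorrect ones. In a white-box setting, the feature extractor would presumably be swapped for one that reads hidden activations, with the supervised step left unchanged.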
