From predictions to confidence intervals: an empirical study of conformal prediction methods for in-context learning
Zhe Huang, Simone Rossi, Rui Yuan, Thomas Hannagan
TL;DR
The paper addresses uncertainty quantification for in-context learning (ICL) in transformers by combining ICL with conformal prediction (CP) to produce distribution-free prediction intervals with guaranteed coverage $1-\alpha$ in a single forward pass. A pre-trained linear self-attention transformer computes conformity scores across a finite grid of candidate labels $z$, which gives exact CP guarantees without refitting a model on each augmented dataset. The authors benchmark CP with ICL against ridge-based oracle CP and split CP, showing robust coverage and competitive compute times, even under distribution shifts. They also identify scaling laws that guide the choice of model size and data usage under a compute budget. The result is a theoretically grounded, scalable framework for uncertainty quantification in transformer-based models, with potential extensions to autoregressive settings and real-world data.
Abstract
Transformers have become a standard architecture in machine learning, demonstrating strong in-context learning (ICL) abilities that allow them to learn from the prompt at inference time. However, uncertainty quantification for ICL remains an open challenge, particularly in noisy regression tasks. This paper investigates whether ICL can be leveraged for distribution-free uncertainty estimation, proposing a method based on conformal prediction to construct prediction intervals with guaranteed coverage. While traditional conformal methods are computationally expensive due to repeated model fitting, we exploit ICL to efficiently generate confidence intervals in a single forward pass. Our empirical analysis compares this approach against ridge regression-based conformal methods, showing that conformal prediction with in-context learning (CP with ICL) achieves robust and scalable uncertainty estimates. Additionally, we evaluate its performance under distribution shifts and establish scaling laws to guide model training. These findings bridge ICL and conformal prediction, providing a new, theoretically grounded framework for uncertainty quantification in transformer-based models.
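To make concrete the "repeated model fitting" that makes traditional conformal methods expensive, the following is a minimal sketch of full conformal prediction with a ridge regressor over a finite grid of candidate labels. It is an illustration under stated assumptions, not the authors' implementation: the function names, the absolute-residual conformity score, the regularization strength `lam`, the synthetic data, and the grid are all hypothetical choices. Each candidate label requires one refit on the augmented dataset, which is the cost that CP with ICL is described as avoiding.

```python
import numpy as np


def ridge_fit_predict(X, y, X_query, lam=1.0):
    """Closed-form ridge regression; returns predictions at X_query."""
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return X_query @ w


def full_conformal_interval(X, y, x_new, z_grid, alpha=0.1, lam=1.0):
    """Full conformal prediction over a finite grid of candidate labels.

    Returns (lo, hi), the smallest and largest accepted candidates,
    or None if no candidate on the grid is accepted.
    """
    X_aug = np.vstack([X, x_new[None, :]])
    accepted = []
    for z in z_grid:
        # Augment the data with the candidate pair (x_new, z) and refit.
        # One refit per candidate is the expensive step.
        y_aug = np.append(y, z)
        preds = ridge_fit_predict(X_aug, y_aug, X_aug, lam)
        # Conformity scores: absolute residuals (an assumed choice).
        scores = np.abs(y_aug - preds)
        # Fraction of the n+1 scores at least as large as the test score.
        p_value = np.mean(scores >= scores[-1])
        if p_value > alpha:
            accepted.append(z)
    if not accepted:
        return None
    return min(accepted), max(accepted)


# Usage on synthetic noisy linear data (hypothetical setup).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(50, 3))
y = X @ w_true + 0.5 * rng.normal(size=50)
x_new = np.array([0.5, 0.5, 0.5])
z_grid = np.linspace(-10.0, 10.0, 401)

interval = full_conformal_interval(X, y, x_new, z_grid, alpha=0.1)
print(interval)
```

The returned pair is the range spanned by the accepted grid points, so it is only as fine as the grid spacing. Split CP, the other baseline mentioned, avoids the loop by fitting once on one part of the data and calibrating residual quantiles on the rest.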
