Theoretical Foundations of Conformal Prediction

Anastasios N. Angelopoulos, Rina Foygel Barber, Stephen Bates

TL;DR

The book develops a rigorous, distribution-free foundation for conformal prediction, rooted in exchangeability and permutation tests, to yield finite-sample uncertainty guarantees for predictive sets without relying on distributional assumptions. It introduces split and full conformal prediction, details a family of conformal score functions (including residual, scaled residual, CQR, and high-probability scores), and analyzes their impact on coverage and efficiency. A central theme is the trade-off between marginal and conditional coverage, including hardness results for test-conditional guarantees in nonatomic settings and practical relaxations like binning and Mondrian conformal prediction. The final part connects conformal prediction to model-based reasoning, showing that with accurate prior models the conformal procedure can asymptotically approach oracle optimality while preserving marginal coverage under exchangeability, and discusses extensions to classification, localization, and robust conditioning. Collectively, this framework illuminates when conformal prediction can provide tight, interpretable, and distribution-free uncertainty sets in modern predictive pipelines, and how to incorporate prior structure for improved performance when assumptions are credible.

Abstract

This book is about conformal prediction and related inferential techniques that build on permutation tests and exchangeability. These techniques are useful in a diverse array of tasks, including hypothesis testing and providing uncertainty quantification guarantees for machine learning systems. Much of the current interest in conformal prediction is due to its ability to integrate into complex machine learning workflows, solving the problem of forming prediction sets without any assumptions on the form of the data generating distribution. Since contemporary machine learning algorithms have generally proven difficult to analyze directly, conformal prediction's main appeal is its ability to provide formal, finite-sample guarantees when paired with such methods. The goal of this book is to teach the reader about the fundamental technical arguments that arise when researching conformal prediction and related questions in distribution-free inference. Many of these proof strategies, especially the more recent ones, are scattered among research papers, making it difficult for researchers to understand where to look, which results are important, and how exactly the proofs work. We hope to bridge this gap by curating what we believe to be some of the most important results in the literature and presenting their proofs in a unified language, with illustrations, and with an eye towards pedagogy.

Paper Structure

This paper contains 186 sections, 901 equations, 36 figures, 1 table.

Figures (36)

  • Figure 1: The conformal score function determines the shape of the sets. The shaded band is a visualization of the prediction set $\mathcal{C}(X_{n+1})\subseteq\mathcal{Y}$ as a function of $X_{n+1}\in\mathcal{X}$. On the left, the residual score gives a fixed-width band. In the middle, the scaled residual score gives a symmetric band that adapts to the non-constant noise variance. On the right, the CQR score gives an asymmetric band that follows the quantiles of the distribution.
  • Figure 2: Illustration of a permutation test for the equality of two real-valued distributions, where the test statistic used is the difference in means between two groups of data points, as in \ref{eqn:perm_test_diff_means}. In each plot, these two group means are shown as two dashed yellow lines. In the left plot, we show the values computed on the real ordering of the data $Z$. The middle and right plots show the values for two typical permutations $Z_\sigma$. The difference in means on the real data is far more extreme than on the permuted data, indicating evidence against the null hypothesis of exchangeability.
  • Figure 3: An illustration of two quantiles chosen on the CDF. The figure illustrates the empirical CDF of the vector $z=(1,1,2,3,4)$, and the calculation of $\mathrm{Quantile}(z;\tau)$, at $\tau = 0.5$ and $\tau=0.1$. This random vector has quantiles $\mathrm{Quantile}(z;0.5) = 2$ and $\mathrm{Quantile}(z;0.1) = 1$. We can see that $\widehat{F}_z(2) = 0.6$ (which is slightly larger than $\tau = 0.5$, due to discreteness), and $\widehat{F}_z(1) = 0.4$ (which is much larger than $\tau = 0.1$, due to the fact that the random vector has a tie at the value $1$).
  • Figure 5: Illustration of notation for a single hypothesized response $y$. This figure illustrates the definitions of Section \ref{sec:define_conformal}. Each of the dark gray dots is a data point, $(X_i, Y_i)$. The larger yellow dot is the hypothesized test point $(X_{n+1}, y)$. The regression model $\hat{f}(x ; \mathcal{D}^y_{n+1})$ is shown as a gray curve. Each score, $S_i^y$, is shown as a dotted line, representing the residual score as defined in \ref{eq:absolute-residual}---i.e., the absolute residual of the model $\hat{f}(x ; \mathcal{D}^y_{n+1})$ on the point $(X_i, Y_i)$ (or $(X_{n+1},y)$, for the case $i=n+1$). The quantile $\hat{q}^y$ is defined as in \ref{eq:full-cp-quantile}.
  • Figure 6: An illustration of full conformal prediction with the residual score function. On the left-hand side are four hypothesized response values $y$, i.e., four distinct iterations of the 'For' loop in Algorithm \ref{alg:full-cp}. In the center, for each possible value $y$ of the response, we display a smaller version of the plot in Figure \ref{fig:full-cp-panel}. Note that each center figure has a different fitted function $\hat{f}(\cdot; \mathcal{D}^y_{n+1})$, since changing the value $y$ has an effect on the regression function. For each value of $y$, the width of the gray shaded band indicates the conformal quantile $\hat{q}^y$, as in Figure \ref{fig:full-cp-panel}. Finally, on the right-hand side, the final prediction set $\mathcal{C}(X_{n+1})$ is shown in dark gray. The set contains all hypothesized response values $y$ whose residuals are no larger than the conformal quantile $\hat{q}^y$---that is, all values of $y$ for which the yellow data point, denoting $(X_{n+1},y)$, lies within the gray band.
  • ...and 31 more figures
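
The numbers quoted in the Figure 3 caption can be reproduced with a short Python sketch. It assumes that $\mathrm{Quantile}(z;\tau)$ means the smallest value whose empirical CDF is at least $\tau$, which is consistent with the values in the caption but is not spelled out on this page. The function names `empirical_cdf` and `quantile` are illustrative and are not taken from the book.

```python
def empirical_cdf(z, x):
    """Fraction of the entries of z that are less than or equal to x."""
    return sum(1 for v in z if v <= x) / len(z)


def quantile(z, tau):
    """Smallest entry v of z with empirical_cdf(z, v) >= tau.

    Assumed definition (see the lead-in); expects 0 < tau <= 1.
    """
    for v in sorted(z):
        if empirical_cdf(z, v) >= tau:
            return v
    raise ValueError("tau must lie in (0, 1]")


# The vector used in the Figure 3 caption.
z = (1, 1, 2, 3, 4)

print(quantile(z, 0.5), empirical_cdf(z, 2))  # quantile at tau = 0.5 and the CDF there
print(quantile(z, 0.1), empirical_cdf(z, 1))  # quantile at tau = 0.1 and the CDF there
```

Under this definition the tie at the value $1$ already puts two of the five points at or below $1$, so the empirical CDF jumps straight to $0.4$ there. This is why the quantile at $\tau = 0.1$ overshoots its target level by so much.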