Anytime-Valid Conformal Risk Control

Bror Hultberg; Dave Zachariah; Antônio H. Ribeiro

TL;DR

The paper extends conformal prediction by introducing anytime-valid risk control, guaranteeing with probability at least $1-\delta$ that the risk of a sequence of prediction sets remains below a target $\alpha$ for all calibration sizes $n$. It constructs explicit correction terms $\gamma_n$ using functions $h_{B,m,\delta}$ and $f_{B,m,\delta}$, yielding time-uniform bounds and asymptotic tightness. The framework handles distribution shift via importance weighting, with corrected thresholds based on $W_n$ and $m^*$ to maintain $\mathbb{E}_{P^*}[\ell(\mathcal{C}_{\lambda_n}(X),Y)\mid \mathcal{D}_n] \le \alpha$ for all $n$. The authors provide theoretical guarantees, a matching lower bound, and empirical demonstrations in synthetic settings and ImageNet, highlighting practical impact for sequential data and robust uncertainty quantification.

Abstract

Prediction sets provide a means of quantifying the uncertainty in predictive tasks. Using held out calibration data, conformal prediction and risk control can produce prediction sets that exhibit statistically valid error control in a computationally efficient manner. However, in the standard formulations, the error is only controlled on average over many possible calibration datasets of fixed size. In this paper, we extend the control to remain valid with high probability over a cumulatively growing calibration dataset at any time point. We derive such guarantees using quantile-based arguments and illustrate the applicability of the proposed framework to settings involving distribution shift. We further establish a matching lower bound and show that our guarantees are asymptotically tight. Finally, we demonstrate the practical performance of our methods through both simulations and real-world numerical examples.
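For context, the fixed-size baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard conformal risk control, not code from the paper. It assumes a loss that is non-increasing in $\lambda$ and bounded by $B$, and picks the smallest grid value $\lambda$ with $\frac{n}{n+1}\hat{R}_n(\lambda) + \frac{B}{n+1} \le \alpha$. The function name `crc_threshold`, the grid of candidate thresholds, and the fallback to the largest grid value are assumptions made for this example.

```python
import numpy as np

def crc_threshold(losses, lambdas, alpha, B=1.0):
    """Fixed-n conformal risk control threshold (illustrative sketch).

    losses  : array of shape (n, K); losses[i, k] is the loss of calibration
              point i when the set is built with threshold lambdas[k].
    lambdas : increasing grid of K candidate thresholds (an assumption of
              this sketch; the loss is taken to be non-increasing in lambda).
    alpha   : target risk level.
    B       : upper bound on the loss.
    """
    losses = np.asarray(losses, dtype=float)
    n = losses.shape[0]
    emp_risk = losses.mean(axis=0)                  # empirical risk per lambda
    adjusted = n / (n + 1) * emp_risk + B / (n + 1) # finite-sample inflation
    ok = np.nonzero(adjusted <= alpha)[0]
    if ok.size == 0:
        # Assumed fallback: no grid value qualifies, so return the largest
        # (most conservative) threshold on the grid.
        return lambdas[-1]
    return lambdas[ok[0]]
```

With a miscoverage loss, `losses[i, k]` would be 1 when the true label falls outside the set built at `lambdas[k]` and 0 otherwise. The resulting guarantee holds on average over calibration sets of a fixed size $n$, which is the limitation the paper addresses.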

Paper Structure (17 sections, 10 theorems, 91 equations, 5 figures)

Introduction
Problem Formulation
Risk of a Prediction Set
Conformal Risk Control
Risk Control with Anytime Validity
Related Work
Main Results
Anytime-Valid Conformal Risk Control
Anytime-Valid Risk Control under Distribution Shift
Derivations
Numerical Experiments
Synthetic Regression Example
Synthetic Regression under Distribution Shift
Image Classification
Conclusion
...and 2 more sections

Key Result

Theorem 4.1

Let $\ell$ be an arbitrary loss function bounded in $[0,B]$, monotone in $\lambda$, and right-continuous in $\lambda$. Construct the prediction sets according to Eq. \ref{eq:correctionterm}, with the correction term $\gamma_n$ defined through the functions $h_{B,m,\delta}$ and $f_{B,m,\delta}$. Then the resulting sequence of prediction sets $\{ \mathcal{C}_{\lambda_n}(X) \}$ achieves anytime-valid risk control in the sense of Eq. \ref{eq:riskcontrol_dataconditional_anytime}.
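The general shape of a corrected threshold can be sketched in code. The sketch below is an assumption-laden illustration, not the paper's construction: it assumes the threshold takes the form $\lambda_n = \inf\{\lambda : \hat{R}_n(\lambda) \le \alpha - \gamma_n\}$ over a finite grid, and it substitutes a simple Hoeffding-plus-union-bound sequence for the paper's $\gamma_n$, whose exact definition in terms of $h_{B,m,\delta}$ and $f_{B,m,\delta}$ is not reproduced here. All function names are hypothetical, and the stand-in sequence is not claimed to deliver the theorem's guarantee.

```python
import math
import numpy as np

def hoeffding_union_gamma(n, delta, B=1.0):
    """Stand-in correction term (NOT the paper's gamma_n).

    Hoeffding deviation at level delta / (n * (n + 1)); these per-n levels
    sum to delta over n >= 1, which is a basic union-bound construction.
    """
    return B * math.sqrt(math.log(n * (n + 1) / delta) / (2 * n))

def corrected_threshold(losses, lambdas, alpha, gamma_n):
    """Smallest grid lambda whose empirical risk is at most alpha - gamma_n.

    Assumes the loss is non-increasing in lambda and lambdas is increasing.
    Falls back to the largest grid value when no candidate qualifies.
    """
    losses = np.asarray(losses, dtype=float)
    emp_risk = losses.mean(axis=0)
    ok = np.nonzero(emp_risk <= alpha - gamma_n)[0]
    return lambdas[ok[0]] if ok.size else lambdas[-1]

def threshold_path(losses, lambdas, alpha, delta, B=1.0,
                   gamma=hoeffding_union_gamma):
    """Recompute the threshold on every prefix of a growing calibration set."""
    losses = np.asarray(losses, dtype=float)
    return [
        corrected_threshold(losses[:n], lambdas, alpha, gamma(n, delta, B))
        for n in range(1, losses.shape[0] + 1)
    ]
```

Passing a different callable as `gamma` swaps in another correction sequence without changing the thresholding logic. As $\gamma_n$ shrinks with $n$, the corrected threshold approaches the uncorrected one.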

Figures (5)

Figure 1: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$. Each line corresponds to a particular draw of calibration data and the miscoverage rates are evaluated conditional on this data at each $n$. A green line indicates that all the rates fall below a specified level $\alpha = 5 \%$, while red indicates a failure to achieve this. The anytime-valid method proposed herein is ensured to achieve it for $1-\delta = 90\%$ of all draws of calibration data. Details are given in Section \ref{sec:synthetic}.
Figure 2: Fixed-time valid method with $\delta=10\%$.
Figure 3: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$ under a distribution shift.
Figure 4: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$.
Figure 5: Expected sizes of prediction sets as a function of calibration sample size for ImageNet-1K classification. The solid blue line shows sets constructed according to Corollary \ref{cor:anytimeconfpred}, while the dashed red line shows sets constructed using Equation \ref{eq:duchi}.

Theorems & Definitions (28)

Example 2.1
Example 2.2
Example 2.3
Example 2.4
Example 2.5
Remark 2.6
Definition 2.7
Theorem 4.1
Corollary 4.2
Remark 4.3
...and 18 more