Anytime-Valid Conformal Risk Control

Bror Hultberg, Dave Zachariah, Antônio H. Ribeiro

TL;DR

The paper extends conformal prediction by introducing anytime-valid risk control, guaranteeing with probability at least $1-\delta$ that the risk of a sequence of prediction sets remains below a target $\alpha$ for all calibration sizes $n$. It constructs explicit correction terms $\gamma_n$ using functions $h_{B,m,\delta}$ and $f_{B,m,\delta}$, yielding time-uniform bounds and asymptotic tightness. The framework handles distribution shift via importance weighting, with corrected thresholds based on $W_n$ and $m^*$ to maintain $\mathbb{E}_{P^*}[\ell(\mathcal{C}_{\lambda_n}(X),Y)\mid \mathcal{D}_n] \le \alpha$ for all $n$. The authors provide theoretical guarantees, a matching lower bound, and empirical demonstrations in synthetic settings and ImageNet, highlighting practical impact for sequential data and robust uncertainty quantification.

Abstract

Prediction sets provide a means of quantifying the uncertainty in predictive tasks. Using held-out calibration data, conformal prediction and risk control can produce prediction sets that exhibit statistically valid error control in a computationally efficient manner. However, in the standard formulations, the error is only controlled on average over many possible calibration datasets of fixed size. In this paper, we extend the control to remain valid with high probability over a cumulatively growing calibration dataset at any time point. We derive such guarantees using quantile-based arguments and illustrate the applicability of the proposed framework to settings involving distribution shift. We further establish a matching lower bound and show that our guarantees are asymptotically tight. Finally, we demonstrate the practical performance of our methods through both simulations and real-world numerical examples.
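
The contrast drawn in the abstract can be written out side by side. The block below is a reading of the TL;DR and abstract rather than a quotation of the paper's numbered equations; in particular, writing the standard guarantee as an expectation over both $\mathcal{D}_n$ and the test point $(X,Y)$ is an assumption about which "standard formulation" is meant.

```latex
% Standard (fixed-size, on-average) risk control: for one fixed n, the
% expectation is taken over the calibration data D_n and the test point (X, Y).
\mathbb{E}\big[\ell(\mathcal{C}_{\lambda_n}(X),Y)\big] \le \alpha

% Anytime-valid risk control: with probability at least 1 - \delta over the
% growing calibration sequence, the data-conditional risk stays below \alpha
% simultaneously for every calibration size n.
\mathbb{P}\Big( \mathbb{E}\big[\ell(\mathcal{C}_{\lambda_n}(X),Y) \mid \mathcal{D}_n\big] \le \alpha \ \text{ for all } n \Big) \ge 1-\delta
```

The second statement is stronger in two ways: it conditions on the calibration data actually observed, and it holds uniformly over $n$ rather than at a single pre-chosen sample size.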

Paper Structure (17 sections, 10 theorems, 91 equations, 5 figures)

Key Result

Theorem 4.1

Let $\ell$ be an arbitrary loss function bounded in $[0,B]$ that is monotone and right-continuous in $\lambda$. Construct the prediction sets according to Eq. \eqref{eq:correctionterm}, with the correction term $\gamma_n$ defined through the functions $h_{B,m,\delta}$ and $f_{B,m,\delta}$ (the displayed definitions are missing from this extract). Then the resulting sequence of prediction sets $\{ \mathcal{C}_{\lambda_n}(X) \}$ achieves anytime-valid risk control in the sense of Eq. \eqref{eq:riskcontrol_dataconditional_anytime}.
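
To make the role of a correction term concrete, here is a minimal Python sketch of threshold selection with a generic additive correction. It is not the paper's construction: the rule "smallest $\lambda$ whose empirical risk plus $\gamma_n$ is at most $\alpha$" is one plausible reading of a corrected threshold, the loss is assumed to be non-increasing in $\lambda$, and `gamma_n` is a user-supplied placeholder because the paper's actual $\gamma_n$ is not reproduced here. The function name `select_threshold` and the grid `lambdas` are hypothetical.

```python
import numpy as np

def select_threshold(losses, lambdas, alpha, gamma_n):
    """Pick the smallest lambda whose corrected empirical risk is <= alpha.

    losses  : array of shape (n, len(lambdas)); losses[i, j] is the loss of
              calibration sample i at lambdas[j]. ASSUMPTION: non-increasing in j.
    lambdas : increasing grid of candidate thresholds.
    gamma_n : non-negative correction term. PLACEHOLDER: the paper's gamma_n
              is not reproduced in this sketch.
    """
    losses = np.asarray(losses, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    risk = losses.mean(axis=0)          # empirical risk at each lambda
    ok = risk + gamma_n <= alpha        # monotone mask: False ... False True ... True
    if not ok.any():
        return lambdas[-1]              # fall back to the most conservative threshold
    return lambdas[np.argmax(ok)]       # first True entry = smallest feasible lambda

if __name__ == "__main__":
    # Miscoverage loss 1{score > lambda} on four toy calibration scores.
    scores = np.array([0.15, 0.35, 0.55, 0.75])
    grid = np.linspace(0.0, 1.0, 11)
    toy_losses = (scores[:, None] > grid[None, :]).astype(float)
    for g in (0.0, 0.1, 0.3):
        print(g, select_threshold(toy_losses, grid, alpha=0.5, gamma_n=g))
```

A larger correction can only push the selected threshold upward, so the prediction sets grow; this is the price paid for a guarantee that holds uniformly over $n$.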

Figures (5)

  • Figure 1: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$. Each line corresponds to a particular draw of calibration data and the miscoverage rates are evaluated conditional on this data at each $n$. A green line indicates that all the rates fall below a specified level $\alpha = 5 \%$, while red indicates a failure to achieve this. The anytime-valid method proposed herein is ensured to achieve it for $1-\delta = 90\%$ of all draws of calibration data. Details are given in Section \ref{sec:synthetic}.
  • Figure 2: Fixed-time valid method with $\delta=10\%$.
  • Figure 3: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$ under a distribution shift.
  • Figure 4: Miscoverage rates of prediction sets $\{ \mathcal{C}_n(X) \}$ versus calibration sample size $n$.
  • Figure 5: Expected sizes of prediction sets as a function of calibration sample size for ImageNet-1K classification. The solid blue line shows sets constructed according to Corollary \ref{cor:anytimeconfpred}, while the dashed red line shows sets constructed using Equation \ref{eq:duchi}.
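
The kind of path plotted in Figure 1 can be illustrated with a small simulation. The sketch below is not the paper's experiment: it assumes conformity scores that are i.i.d. Uniform(0,1), so that the miscoverage conditional on the calibration data is exactly $1 - q_n$ for a threshold $q_n$, and it tracks only the standard split-conformal quantile, not the anytime-valid method. The function names and default parameters are hypothetical.

```python
import numpy as np

def conditional_miscoverage_paths(n_max=1000, n_min=20, alpha=0.05,
                                  n_draws=100, seed=0):
    """Data-conditional miscoverage of the standard conformal quantile vs. n.

    ASSUMPTION: scores are i.i.d. Uniform(0,1), so a test score exceeds the
    threshold q with probability exactly 1 - q given the calibration data.
    """
    rng = np.random.default_rng(seed)
    ns = np.arange(n_min, n_max + 1)
    paths = np.empty((n_draws, ns.size))
    for d in range(n_draws):
        scores = rng.uniform(size=n_max)       # one growing calibration stream
        for j, n in enumerate(ns):
            # Standard split-conformal rank, clipped so that it never exceeds n.
            k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
            q = np.partition(scores[:n], k - 1)[k - 1]   # k-th smallest score
            paths[d, j] = 1.0 - q              # exact conditional miscoverage
    return ns, paths

def fraction_always_valid(paths, alpha=0.05):
    """Share of calibration draws whose miscoverage stays <= alpha for every n."""
    return float((paths <= alpha).all(axis=1).mean())

if __name__ == "__main__":
    ns, paths = conditional_miscoverage_paths()
    print("fraction of draws valid at all n:", fraction_always_valid(paths))
```

Each row of `paths` corresponds to one line in a plot of this kind, and `fraction_always_valid` counts the lines that would be drawn green. Because the standard quantile only controls miscoverage on average at a fixed $n$, nothing forces this fraction to reach $1-\delta$; that gap is what an anytime-valid correction is designed to close.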

Theorems & Definitions (28)

  • Example 2.1
  • Example 2.2
  • Example 2.3
  • Example 2.4
  • Example 2.5
  • Remark 2.6
  • Definition 2.7
  • Theorem 4.1
  • Corollary 4.2
  • Remark 4.3
  • ...and 18 more