Automatically Adaptive Conformal Risk Control

Vincent Blot, Anastasios N Angelopoulos, Michael I Jordan, Nicolas J-B Brunel

TL;DR

This work introduces automatically adaptive conformal risk control (AA-CRC), a framework that extends conformal risk control to adapt uncertainty to input difficulty by learning an automatic conditioning function class. It builds on conformal prediction and the multiaccuracy perspective to provide conditional risk guarantees that are robust to covariate shifts represented in an embedding space or group structure, while handling label-conditional coverage. Theoretical results connect a tilted risk bound $\mathbb{E}_{\lambda}[\ell(Y_{n+1},\mathcal{C}(X_{n+1}))]$ to the choice of $\lambda$ and a regularizer, enabling efficient computation via surrogate objectives $\tilde{J}(\lambda)$. Empirically, AA-CRC improves precision at fixed recall in semantic segmentation (polyp and fire datasets) by learning adaptive thresholds from embeddings or random-forest leaves, illustrating practical gains in uncertainty quantification for black-box models with controlled risk. The approach offers a scalable, data-driven path to automatic conditional guarantees in high-dimensional and structured prediction tasks.

Abstract

Science and technology have a growing need for effective mechanisms that ensure reliable, controlled performance from black-box machine learning algorithms. These performance guarantees should ideally hold conditionally on the input; that is, the performance guarantees should hold, at least approximately, no matter what the input is. However, beyond stylized discrete groupings such as ethnicity and gender, the right notion of conditioning can be difficult to define. For example, in problems such as image segmentation, we want the uncertainty to reflect the intrinsic difficulty of the test sample, but this may be difficult to capture via a conditioning event. Building on the recent work of Gibbs et al. [2023], we propose a methodology for achieving approximate conditional control of statistical risks (the expected values of loss functions) by adapting to the difficulty of test samples. Our framework goes beyond traditional conditional risk control based on user-provided conditioning events to the algorithmic, data-driven determination of appropriate function classes for conditioning. We apply this framework to various regression and segmentation tasks, enabling finer-grained control over model performance and demonstrating that, by continuously monitoring and adjusting these parameters, we can achieve superior precision compared to conventional risk-control methods.
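
For concreteness, the constant-threshold baseline that AA-CRC generalizes can be sketched in code. This is a hedged illustration of standard conformal risk control on synthetic data; the function name `crc_threshold`, the candidate grid, and the toy loss model are our own assumptions, not the paper's implementation.

```python
import numpy as np

def crc_threshold(losses, lambdas, alpha, B=1.0):
    """Pick the smallest candidate threshold whose adjusted empirical risk
    is at most alpha (conformal risk control with a single constant threshold).

    losses : (n, m) array; losses[i, j] is the loss of calibration sample i
             at candidate threshold lambdas[j], assumed nonincreasing in j.
    B      : an upper bound on the loss (here losses lie in [0, 1]).
    """
    n = losses.shape[0]
    adjusted = (n / (n + 1)) * losses.mean(axis=0) + B / (n + 1)
    feasible = np.flatnonzero(adjusted <= alpha)
    # Fall back to the most permissive candidate if none is feasible.
    return lambdas[feasible[0]] if feasible.size else lambdas[-1]

# Synthetic example: each calibration sample "needs" threshold u_i, so its
# loss at lambda is 1 when lambda < u_i (e.g., a recall target is missed)
# and 0 otherwise.
rng = np.random.default_rng(0)
u = rng.uniform(size=500)
lambdas = np.linspace(0.0, 1.0, 201)
losses = (lambdas[None, :] < u[:, None]).astype(float)
lam_hat = crc_threshold(losses, lambdas, alpha=0.1)
```

A single `lam_hat` is applied to every test input; AA-CRC instead lets the threshold vary with the input, which is what produces the per-image thresholds shown in Figure 1.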


Paper Structure

This paper contains 12 sections, 1 theorem, 21 equations, 6 figures, 1 algorithm.

Key Result

Theorem 1

Consider a vector space $\Lambda$ equipped with the standard addition operation, and assume that for all $\lambda, \lambda' \in \Lambda$, the derivative of $\epsilon \mapsto \mathcal{R}(\lambda + \epsilon \lambda')$ exists. If $\lambda$ is nonnegative and $\mathbb{E}[\lambda(X_{n+1},Y_{n+1})] > 0$, the
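
The hypothesis $\mathbb{E}[\lambda(X_{n+1},Y_{n+1})] > 0$ suggests reading the tilted risk $\mathbb{E}_{\lambda}[\ell(Y_{n+1},\mathcal{C}(X_{n+1}))]$ from the TL;DR as a $\lambda$-reweighted, renormalized expectation. This is our hedged reading of the notation, not a statement taken from the paper:

```latex
% One standard reading of the lambda-tilted expectation: reweight the loss
% by the nonnegative conditioning function lambda and renormalize, which is
% well defined precisely when E[lambda(X,Y)] > 0.
\mathbb{E}_{\lambda}\!\left[\ell\big(Y_{n+1}, \mathcal{C}(X_{n+1})\big)\right]
  \;=\;
  \frac{\mathbb{E}\!\left[\lambda(X_{n+1}, Y_{n+1})\,
        \ell\big(Y_{n+1}, \mathcal{C}(X_{n+1})\big)\right]}
       {\mathbb{E}\!\left[\lambda(X_{n+1}, Y_{n+1})\right]}
```

Under this reading, a constant $\lambda$ recovers the ordinary risk $\mathbb{E}[\ell]$, while richer function classes for $\lambda$ tilt the guarantee toward harder regions of the input space.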

Figures (6)

  • Figure 1: Example of polyp segmentations with conformal risk control (CRC) and our methodology (AA-CRC), where the true positive pixels are in white and the false positives in blue. To guarantee the recall on the image, our method outputs thresholds equal to 0.304 and 0.330, while the constant threshold of the CRC methodology is 0.276. This difference implies a higher precision for our methodology.
  • Figure 2: Top figure. The blue curve is the model prediction, blue dots are test data points, and prediction intervals are shown in orange. The dotted lines represent the CRC prediction intervals, which have constant width. Bottom figure. The within-group coverage for each of the adaptively selected groups. The red line is the target coverage level. The coverage is almost exact for all groups with AA-CRC, while almost all groups are either undercovered or overcovered with the standard CRC method.
  • Figure 3: Procedure to create the embedding of the images. The first step is the training of the segmentation model on the $\mathcal{D}_{train}$ dataset. The second step is the learning of the embedding based on the segmentation output on the $\mathcal{D}_{res}$ dataset. The third step is solving the optimization procedure on the $\mathcal{D}_{cal}$ dataset.
  • Figure 4: Top figure. Plot of the first two components of the PCA on the embeddings of the images. The color of each point corresponds to the threshold returned by AA-CRC. Bottom figure. The within-group coverage for each of the adaptively selected groups. The red line is the target coverage level. The coverage is almost exact for all groups with AA-CRC, while almost all groups are either undercovered or overcovered with the standard CRC method.
  • Figure 5: Recall control for polyp segmentation. The top figure compares the control of the recall achieved with our method (AA-CRC) to the control achieved with CRC. White pixels are true positives, blue pixels are false positives, and red pixels are false negatives. The bottom figures represent the distribution of the recall of our procedure and the distribution of the precision for both CRC and AA-CRC over 100 independent random data splits.
  • ...and 1 more figure
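
The within-group coverage behavior in Figures 2 and 4 can be illustrated with a simplified stand-in: the paper learns adaptive thresholds from image embeddings or random-forest leaves, whereas the sketch below merely bins a one-dimensional "embedding" into quantile groups and computes a split-conformal threshold per group. All names and the toy data here are our own assumptions.

```python
import numpy as np

def groupwise_thresholds(embedding, scores, n_groups=4, alpha=0.1):
    """Per-group conformal thresholds as a crude proxy for AA-CRC's
    adaptive ones. Groups are quantile bins of a 1-D embedding coordinate;
    within each group the threshold is the conformal quantile of the
    nonconformity scores, giving approximate within-group coverage."""
    edges = np.quantile(embedding, np.linspace(0, 1, n_groups + 1))
    edges[-1] += 1e-9  # make the last bin right-inclusive
    group = np.clip(np.searchsorted(edges, embedding, side="right") - 1,
                    0, n_groups - 1)
    thresholds = np.empty(n_groups)
    for g in range(n_groups):
        s = np.sort(scores[group == g])
        # Split-conformal index: ceil((n_g + 1)(1 - alpha))-th order statistic.
        k = int(np.ceil((s.size + 1) * (1 - alpha))) - 1
        thresholds[g] = s[min(k, s.size - 1)]
    return edges, group, thresholds

# Heteroscedastic toy data: nonconformity scores grow with the embedding,
# so "harder" inputs should receive larger thresholds.
rng = np.random.default_rng(0)
emb = rng.uniform(0.0, 1.0, 4000)
scores = np.abs(rng.normal(0.0, 0.1 + emb))
edges, group, thr = groupwise_thresholds(emb, scores)
```

The thresholds increase with difficulty, mirroring the qualitative pattern in Figure 4 where the color gradient over the PCA plane tracks the threshold returned by AA-CRC; the paper's method replaces these hand-made bins with an automatically determined function class.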

Theorems & Definitions (2)

  • Theorem 1
  • Proof