Conformal Risk Training: End-to-End Optimization of Conformal Risk Control

Christopher Yeh, Nicolas Christianson, Adam Wierman, Yisong Yue

TL;DR

This work extends conformal risk control (CRC) from controlling the expected loss to a broad class of optimized certainty equivalent (OCE) risks, enabling provable tail-risk guarantees such as CVaR. It introduces conformal risk training (CRT), an end-to-end framework that differentiates through the risk-control mechanism during model training, yielding models that perform better on average while satisfying risk constraints. The authors derive CORC and CVaR-specific results, provide gradient computation strategies for the inner optimization, and demonstrate substantial improvements over post-hoc risk control in tumor segmentation and battery storage applications. Overall, this approach offers a practical, risk-aware training paradigm for high-stakes ML tasks with concrete guarantees and improved downstream utility.

Abstract

While deep learning models often achieve high predictive accuracy, their predictions typically do not come with any provable guarantees on risk or reliability, which are critical for deployment in high-stakes applications. The framework of conformal risk control (CRC) provides a distribution-free, finite-sample method for controlling the expected value of any bounded monotone loss function and can be conveniently applied post-hoc to any pre-trained deep learning model. However, many real-world applications are sensitive to tail risks, as opposed to just expected loss. In this work, we develop a method for controlling the general class of Optimized Certainty-Equivalent (OCE) risks, a broad class of risk measures which includes as special cases the expected loss (generalizing the original CRC method) and common tail risks like the conditional value-at-risk (CVaR). Furthermore, standard post-hoc CRC can degrade average-case performance due to its lack of feedback to the model. To address this, we introduce "conformal risk training," an end-to-end approach that differentiates through conformal OCE risk control during model training or fine-tuning. Our method achieves provable risk guarantees while demonstrating significantly improved average-case performance over post-hoc approaches on applications to controlling classifiers' false negative rate and controlling financial risk in battery storage operation.
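To make the relationship between expected loss and CVaR concrete, the sketch below computes an empirical CVaR through the Rockafellar-Uryasev variational form, $\mathrm{CVaR}_\delta(L) = \min_t \{ t + \mathbb{E}[(L - t)_+] / (1 - \delta) \}$, which is the usual way CVaR is written as an OCE risk. The function name `empirical_cvar` and the convention that $\delta = 0$ recovers the mean are assumptions made for illustration; the paper's own parameterization of $\delta$ may differ.

```python
import numpy as np

def empirical_cvar(losses, delta):
    """Empirical CVaR at level delta (hypothetical helper, not from the paper).

    Uses the variational form  min_t { t + mean(max(L - t, 0)) / (1 - delta) }.
    Assumed convention: delta = 0 gives the plain mean, and delta close to 1
    focuses on the worst losses.
    """
    losses = np.asarray(losses, dtype=float)
    if losses.ndim != 1 or losses.size == 0:
        raise ValueError("losses must be a non-empty 1-D array")
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")

    # The objective is convex and piecewise linear in t, with kinks only at
    # the sample values, so a minimizer can always be found among the samples.
    candidates = np.unique(losses)
    excess = np.maximum(losses[None, :] - candidates[:, None], 0.0)
    objective = candidates + excess.mean(axis=1) / (1.0 - delta)
    return float(objective.min())
```

With four equally weighted losses `[1, 2, 3, 4]`, `delta = 0.5` averages the worst half of the samples, while `delta = 0.0` reduces to the ordinary mean, which is the sense in which expected loss is a special case.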

Paper Structure

This paper contains 31 sections, 14 theorems, 67 equations, 5 figures, 3 tables, 3 algorithms.

Key Result

Proposition 1

Under Assumptions as:crc-generic and as:nondecreasing, let $\alpha \in \mathbb{R}$ be a desired risk level and define the set $\hat{\Lambda}$ as […]. Then, for any $\lambda \in \hat{\Lambda}$, we have $\mathbb{E}[L_{N+1}(\lambda)] \leq \alpha$. Furthermore, if Assumption as:crc-alpha-feasible holds, then choosing $\hat{\lambda} \coloneqq \max\{\lambda_{\min}, \sup \hat{\Lambda}\}$ ensures risk control: $\mathbb{E}[L_{N+1}(\hat{\lambda})] \leq \alpha$.
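The selection rule $\hat{\lambda} = \max\{\lambda_{\min}, \sup \hat{\Lambda}\}$ can be sketched on a finite grid of candidate thresholds. The definition of $\hat{\Lambda}$ is not reproduced in the text above, so this sketch assumes the standard conformal risk control condition for losses bounded above by $B$: $\lambda$ is accepted when $\frac{1}{N+1}\left(B + \sum_{i=1}^{N} L_i(\lambda)\right) \leq \alpha$. The names `crc_threshold`, `loss_matrix`, and `loss_bound` are hypothetical, and taking the maximum over a grid only approximates the supremum.

```python
import numpy as np

def crc_threshold(loss_matrix, lambdas, alpha, loss_bound, lambda_min):
    """Grid-based sketch of post-hoc conformal risk control.

    loss_matrix[i, j] holds L_i(lambdas[j]) for calibration point i, and is
    assumed nondecreasing in lambda and bounded above by loss_bound.
    The acceptance rule below is the standard CRC condition, assumed here
    because the set definition is not given in the surrounding text.
    """
    loss_matrix = np.asarray(loss_matrix, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n = loss_matrix.shape[0]

    # Inflate the empirical risk by one worst-case loss for the unseen point.
    inflated_risk = (loss_matrix.sum(axis=0) + loss_bound) / (n + 1)
    feasible = inflated_risk <= alpha

    if not feasible.any():
        # No grid point is certified; fall back to the most conservative value.
        return float(lambda_min)
    return float(max(lambda_min, lambdas[feasible].max()))
```

As a usage sketch, a 0/1 false-negative loss such as `L_i(lam) = 1 if lam > score_i else 0` is nondecreasing in the classification threshold, so `loss_matrix` can be built as `(lambdas[None, :] > scores[:, None]).astype(float)` with `loss_bound = 1.0`.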

Figures (5)

  • Figure 1: Results for the tumor image segmentation problem (\ref{sec:experiments_polyps}) across different FNR thresholds $\alpha$ with 10 random seeds. (left) All three methods show FNR controlled at level $\alpha$. (middle) Whereas only applying CRC post-hoc results in very large FPR, our method (conformal risk training, in green) is able to significantly reduce FPR without sacrificing FNR. (right) Our method generally picks a higher classification threshold $\lambda$ than the baselines, suggesting that it is less conservative.
  • Figure 2: Results from the battery storage problem (\ref{sec:experiments_battery}) across different CVaR quantile levels $\delta$ and risk control thresholds $\alpha$, with 10 random seeds. (top) All three methods show CVaR risk controlled at the target level $\alpha$. (bottom) Comparison of the relative increase in profit (i.e., negative task loss) achieved by different methods over the post-hoc CRC baseline. Higher values are better.
  • Figure 3: Predictions on 8 randomly selected images from the test set for the tumor image segmentation problem (\ref{sec:experiments_polyps}). Black and white pixels indicate correct outputs. False positives are shaded teal; false negatives are shaded red.
  • Figure 4: Values of the decision scaling factor $\lambda$ for the battery storage problem (\ref{sec:experiments_battery}) across different CVaR quantile levels $\delta$ and risk control thresholds $\alpha$, with 10 random seeds. Higher values indicate more aggressiveness and larger charge/discharge decisions.
  • Figure 5: Comparison of financial tail risk (top) and task loss (bottom) as a function of calibration set size on the battery storage problem (\ref{sec:experiments_battery}), across different CVaR quantile levels $\delta$ and risk control thresholds $\alpha$, with 10 random seeds. Here, post-hoc conformal CVaR control is applied to the pre-trained prediction model $\hat{y}_\theta$.

Theorems & Definitions (31)

  • Proposition 1
  • Definition 1 (Ben-Tal and Teboulle, 1986; 2007)
  • Definition 2 (Rockafellar and Uryasev, 2000; 2002)
  • Theorem 1
  • Theorem 2
  • Example 1
  • Theorem (informal version of \ref{thm:lambda-gradient}; see \ref{appendix:lambda-gradient-proof})
  • Example 2
  • Definition 3 (Bertsekas, 2009)
  • Lemma 1
  • ...and 21 more