Conformal Loss-Controlling Prediction

Di Wang; Ping Wang; Zhong Ji; Xiaojun Yang; Hongyue Li

TL;DR

This work addresses the need to control the value of a general loss $L$ on each test object, not just the coverage of prediction sets. It introduces conformal loss-controlling prediction (CLCP), which selects a nesting parameter $\lambda^*$ so that $P(L(Y_{n+1}, C_{\lambda^*}(X_{n+1})) \le \alpha) \ge 1 - \delta$ under exchangeability. This extends inductive conformal prediction (CP) and, unlike conformal risk control, bounds the loss on each test object rather than its expectation. The guarantee is proved for finite samples, and CP is shown to be a special case of CLCP. Experiments cover classification with a class-varying loss and statistical postprocessing of weather forecasts; they confirm the guarantee empirically and show that the choice of underlying model affects predictive efficiency. The framework relies only on calibration data and nested prediction sets, so it can be applied in domains such as medical imaging and numerical weather prediction, and better-designed underlying algorithms yield more informative prediction sets.

Abstract

Conformal prediction is a learning framework that controls the coverage of prediction sets and can be built on any learning algorithm for point prediction. This work proposes a learning framework named conformal loss-controlling prediction, which extends conformal prediction to situations where the value of a loss function needs to be controlled. Unlike existing work on risk-controlling prediction sets and conformal risk control, which aims to control the expected value of a loss function, the proposed approach controls the loss for any test object, extending conformal prediction from the miscoverage loss to a general loss. The controlling guarantee is proved in the finite-sample case under the assumption of exchangeable data, and the framework is tested empirically on classification with a class-varying loss and on statistical postprocessing of numerical weather forecasts, which are formulated as point-wise classification and point-wise regression problems. The theoretical analysis and experimental results support the effectiveness of the loss-controlling approach.
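
The step from a general loss back to ordinary coverage can be written out explicitly. The following worked equations specialize the loss-controlling guarantee to the miscoverage loss; the indicator notation and the restriction $\alpha \in [0, 1)$ are assumptions made here for illustration and are not quoted from the paper.

```latex
% Miscoverage loss: 1 if the label falls outside the prediction set, 0 otherwise.
\begin{align*}
L(y, C) &= \mathbb{1}\{y \notin C\}, \\
% For any \alpha in [0, 1), the loss is at most \alpha exactly when the label is covered.
L(Y_{n+1}, C_{\lambda^*}(X_{n+1})) \le \alpha
  &\iff Y_{n+1} \in C_{\lambda^*}(X_{n+1}), \\
% Hence the loss-controlling guarantee becomes the usual coverage guarantee.
P\bigl(L(Y_{n+1}, C_{\lambda^*}(X_{n+1})) \le \alpha\bigr) \ge 1 - \delta
  &\iff P\bigl(Y_{n+1} \in C_{\lambda^*}(X_{n+1})\bigr) \ge 1 - \delta.
\end{align*}
```

Under this choice of loss, $\delta$ plays the role of the miscoverage level in conformal prediction, and $\alpha$ no longer matters as long as it stays below 1.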

Paper Structure (12 sections, 1 theorem, 40 equations, 8 figures, 1 table, 1 algorithm)

Introduction
Inductive Conformal Prediction and Conformal Risk Control
Inductive Conformal Prediction
Conformal Risk Control
Conformal Loss-Controlling Prediction And Its Theoretical Analysis
Experiments
CLCP for classification with a class-varying loss
CLCP for high-impact weather forecasting
Dataset for high temperature forecasting
Dataset for low temperature forecasting
CLCP for maximum temperature and minimum temperature forecasting
Conclusion

Key Result

Theorem 1

Suppose $\{(X_i, Y_i)\}_{i = 1}^{n+1}$ are $n+1$ data drawn exchangeably from $P_{XY}$ on $\mathcal{X} \times \mathcal{Y}$, and $C_{\lambda}: \mathcal{X} \rightarrow \mathcal{Y}'$ is a set-valued function satisfying formula (1) with the parameter $\lambda$ taking values from a discrete set $\Lambda \subset \ldots$ Then for any $\delta \in (\frac{1}{n+1},1)$, we have $P(L(Y_{n+1}, C_{\lambda^*}(X_{n+1})) \le \alpha) \ge 1 - \delta$, where $\lambda^*$ is defined as in formula (5).
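
One way to picture how a $\lambda^*$ with this kind of guarantee could be chosen from calibration data is sketched below. Formula (5) is not reproduced in this summary, so the selection rule used here is an assumption: take the smallest $\lambda$ whose $\lceil (n+1)(1-\delta) \rceil$-th smallest calibration loss is at most $\alpha$. The function name `select_lambda`, the loss-matrix layout, and the requirement that losses be nonincreasing in $\lambda$ (larger $\lambda$ gives larger nested sets) are likewise hypothetical.

```python
import math

import numpy as np


def select_lambda(cal_losses, lambdas, alpha, delta):
    """Pick a nesting parameter from calibration losses (illustrative sketch).

    cal_losses[i, j] is the loss of calibration example i at lambdas[j].
    Assumptions (not taken from the paper): lambdas is sorted ascending and
    each row of cal_losses is nonincreasing along j.
    Returns the chosen lambda, or None if no candidate meets the target.
    """
    cal_losses = np.asarray(cal_losses, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    n = cal_losses.shape[0]

    # Rank of the order statistic; the small tolerance guards against
    # floating-point noise when (n + 1) * (1 - delta) is an integer.
    k = math.ceil((n + 1) * (1 - delta) - 1e-12)
    if k > n:
        # Happens when delta < 1 / (n + 1): too few calibration points.
        raise ValueError("delta is too small for this calibration size")

    # k-th smallest calibration loss for every candidate lambda.
    kth_smallest = np.sort(cal_losses, axis=0)[k - 1]

    # Smallest lambda whose k-th smallest loss is within the budget alpha.
    feasible = np.nonzero(kth_smallest <= alpha)[0]
    if feasible.size == 0:
        return None
    return lambdas[feasible[0]]
```

The `ValueError` branch mirrors the theorem's condition $\delta > \frac{1}{n+1}$: with fewer calibration points the required order statistic does not exist. Returning `None` when no candidate qualifies is a design choice of this sketch; a caller might instead fall back to the largest available set.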

Figures (8)

Figure 1: Bar plots of the frequencies of the prediction losses being greater than $\alpha$ vs. $\delta = 0.05, 0.1, 0.15, 0.2$ on test data for classification with a class-varying loss. The first row corresponds to $\alpha = 0.1$ and the second row corresponds to $\alpha = 0.2$. Different columns represent different classifiers. All bars are near or below the preset $\delta$, which confirms the controlling guarantee of CLCP empirically.
Figure 2: Bar plots of the average sizes of prediction sets vs. $\delta = 0.05, 0.1, 0.15, 0.2$ on test data for classification with a class-varying loss. The first row corresponds to $\alpha = 0.1$ and the second row corresponds to $\alpha = 0.2$. Different columns represent different classifiers. The plots show how informative the prediction sets are. In general, a larger $\delta$ leads to a smaller average size, and different classifiers have different informational efficiency.
Figure 3: Bar plots of the frequencies of the prediction losses being greater than $\alpha$ vs. $\delta = 0.05, 0.1, 0.15, 0.2$ on test data for high-impact weather forecasting. The first row corresponds to HighTemp and the second row corresponds to LowTemp. Different columns represent different $\alpha$. All bars are near or below the preset $\delta$, which confirms the controlling guarantee of CLCP empirically.
Figure 4: Boxen plots of the prediction losses vs. $\delta = 0.05, 0.1, 0.15, 0.2$ on test data for high-impact weather forecasting. The first row corresponds to HighTemp and the second row corresponds to LowTemp. Different columns represent different $\alpha$. The loss distributions are controlled by $\alpha$ and $\delta$ properly to obtain the empirical validity in Fig. 3.
Figure 5: Boxen plots for the distributions of normalized sizes of prediction sets vs. $\delta = 0.05, 0.1, 0.15, 0.2$ on test data for high-impact weather forecasting. The first row corresponds to HighTemp and the second row corresponds to LowTemp. Different columns represent different $\alpha$. U-Net performs better than nDNN, which indicates the importance of careful design of the underlying algorithm.
...and 3 more figures

Theorems & Definitions (3)

Definition 1
Theorem 1
proof
