From Conformal Predictions to Confidence Regions

Charles Guille-Escuret; Eugene Ndiaye

TL;DR

CCR extends conformal prediction to the parameter space by aggregating finite-sample CP intervals for noise-free outputs, yielding finite-sample valid confidence regions for model parameters under minimal noise assumptions. It constructs Θ_k via aggregation over individual θ-sets Θ(X_i) with voting-based constraints, and provides bounds under fully black-box, randomized Markov, and split-conformal regimes, including a MILP-compatible formulation for linear models. The method handles heteroskedastic and non-Gaussian noise, offers PAC-type guarantees, and enables practical downstream tasks such as robust optimization and regression abstention. Empirically, CCR demonstrates competitive coverage and tighter coordinate intervals relative to existing conformal-based approaches, while maintaining finite-sample validity and enabling hypothesis testing of linearity.

Abstract

Conformal prediction methodologies have significantly advanced the quantification of uncertainties in predictive models. Yet, the construction of confidence regions for model parameters presents a notable challenge, often necessitating stringent assumptions regarding the data distribution or merely providing asymptotic guarantees. We introduce a novel approach termed CCR, which combines conformal prediction intervals for the model outputs to establish confidence regions for model parameters. We present coverage guarantees that hold under minimal assumptions on the noise and are valid in the finite-sample regime. Our approach is applicable to both split conformal predictions and black-box methodologies, including full or cross-conformal approaches. In the specific case of linear models, the derived confidence region manifests as the feasible set of a Mixed-Integer Linear Program (MILP), facilitating the deduction of confidence intervals for individual parameters and enabling robust optimization. We empirically compare CCR to recent advancements in challenging settings such as heteroskedastic and non-Gaussian noise.
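To make the voting-based aggregation concrete, the following is a minimal sketch for a linear model, not the authors' implementation. It assumes that each input X_i comes with an interval [lower_i, upper_i] for its noise-free output, and that a candidate θ belongs to the region when at least k of the n constraints lower_i ≤ θᵀX_i ≤ upper_i hold. The function names, the "at least k" convention, and the toy intervals (which are hand-made, not produced by conformal prediction) are all assumptions made for illustration.

```python
import numpy as np


def count_votes(theta, X, lower, upper):
    """Count how many interval constraints lower_i <= X_i @ theta <= upper_i hold."""
    theta = np.asarray(theta, dtype=float)
    X = np.asarray(X, dtype=float)
    preds = X @ theta  # linear-model outputs for the candidate theta
    satisfied = (np.asarray(lower) <= preds) & (preds <= np.asarray(upper))
    return int(satisfied.sum())


def in_voting_region(theta, X, lower, upper, k):
    """Hypothetical membership test: theta is kept if it gets at least k votes."""
    return count_votes(theta, X, lower, upper) >= k


# Toy example: hand-made intervals of half-width 0.5 around the noise-free outputs.
theta_star = np.array([1.0, -2.0])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
noise_free = X @ theta_star
lower, upper = noise_free - 0.5, noise_free + 0.5

print(in_voting_region(theta_star, X, lower, upper, k=4))  # ground truth satisfies every constraint
print(count_votes([1.0, -1.0], X, lower, upper))           # a poor candidate collects few votes
```

A membership test like this only checks one candidate at a time. Describing the whole region requires encoding the "at least k of n" condition, which is the kind of constraint a MILP can express with one binary indicator per interval.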

Paper Structure (37 sections, 18 theorems, 87 equations, 5 figures, 4 tables)

Introduction
Background
Contributions
Confidence Set for Noise-free Outputs
Confidence Set for the Model Parameter
Aggregation by Voting
Fully Black-Box
Randomized Markov's Inequality
Worst-Case Dependency
Split Conformal Prediction
Related Work
Conformal prediction
Confidence regions
Applications
Mixed Integer Linear Program
...and 22 more sections

Key Result

Proposition 2.2

Under the model in eq:model, assum:gamma_is_interval and assum:noise, it holds

Figures (5)

Figure 2: Illustration of the bounds covering each coordinate of the ground-truth parameter $\theta_\star$ under various configurations of noise, for $\beta=0.1$. Squares (resp. circles) correspond to upper bounds (resp. lower bounds).
Figure 3: Regression with rejection option with $Y_i = \sin(X_i) + \frac{\pi |X_i|}{20} \xi_i$ where $\xi_i \sim \mathcal{N}(0, 1)$.
Figure 4: Limitations of CCR compared to the classical approach when the model and the data distribution behave nicely. One can observe a clear advantage of the classical method, which reduces the variance and shrinks to the ground truth as more data is collected.
Figure 5: Different noise settings where the classical confidence set estimates the standard deviation. In all settings, we use the median as the base predictor for conformal prediction. We recall that in these examples, the classical strategy has no validity. Our approach starts by establishing validity first and then proceeds to improve efficiency. As one can observe in the case of the Pareto distribution, the standard confidence set can be misleading, and this is hard to spot with real datasets since the ground truth is never known. By setting $b=0.5$, our method is simultaneously valid for any symmetric noise.
Figure 6: Guaranteed coverage with the PAC bounds when the tolerance level $\delta$ varies. We fix $n=30$ and $n_{\mathrm{cal}} = 50$. The function $H(k)$ corresponds to the lower bound on the expected coverage.

Theorems & Definitions (30)

Proposition 2.2
Proposition 3.1
Proposition 3.1
Lemma 3.1
Proposition 3.2
Proposition 3.3
Remark 3.4: PAC Bounds
Proposition 5.0
Proposition 5.0
Proposition C.0
...and 20 more
