Adaptive Coverage Policies in Conformal Prediction

Etienne Gauthier, Francis Bach, Michael I. Jordan

TL;DR

The paper tackles the rigidity of fixed coverage in conformal prediction by introducing data-dependent miscoverage levels via conformal e-prediction with post-hoc validity, enabling instance-specific prediction-set sizes. It learns a coverage policy using a leave-one-out calibration scheme and trains a neural network to map calibration-score sums and test statistics to a data-driven miscoverage level $\tilde{\alpha}$, while regularizing to keep prediction sets compact. The approach provides marginal coverage guarantees even when $\tilde{\alpha}$ is chosen adaptively and demonstrates practical gains on CIFAR-10, where adaptive sets are smaller than fixed e-based baselines without sacrificing validity. This yields a flexible, distribution-free uncertainty quantification method suitable for per-instance uncertainty management in real-world ML applications.

Abstract

Traditional conformal prediction methods construct prediction sets such that the true label falls within the set with a user-specified coverage level. However, poorly chosen coverage levels can result in uninformative predictions, either producing overly conservative sets when the coverage level is too high, or empty sets when it is too low. Moreover, the fixed coverage level cannot adapt to the specific characteristics of each individual example, limiting the flexibility and efficiency of these methods. In this work, we leverage recent advances in e-values and post-hoc conformal inference, which allow the use of data-dependent coverage levels while maintaining valid statistical guarantees. We propose to optimize an adaptive coverage policy by training a neural network using a leave-one-out procedure on the calibration set, allowing the coverage level and the resulting prediction set size to vary with the difficulty of each individual example. We support our approach with theoretical coverage guarantees and demonstrate its practical benefits through a series of experiments.

Paper Structure

This paper contains 12 sections, 5 theorems, 50 equations, 6 figures, 1 table, 2 algorithms.

Key Result

Proposition 2.2

Consider a calibration set $\{(X_i,Y_i)\}_{i=1}^n$ and a test data point $(X_{\rm test},Y_{\rm test})$ such that $(X_1,Y_1),\dotsc,(X_n,Y_n),(X_{\rm test},Y_{\rm test})$ are exchangeable. Let $\tilde{\alpha} > 0$ be any miscoverage level that may depend on this data. Then the post-hoc coverage guarantee (eq:posthoc) holds for the conformal e-prediction set built at level $\tilde{\alpha}$. When $\tilde{\alpha}$ is a fixed constant independent of the data, the guarantee (eq:posthoc) reduces to the standard marginal coverage guarantee at level $1-\tilde{\alpha}$.
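
To make the prediction set in Proposition 2.2 concrete, here is a minimal sketch for classification. It assumes a form that is not stated on this page: each candidate label's e-value is its nonconformity score divided by the average of the $n$ calibration scores and that score, and a label is kept when its e-value is below $1/\tilde{\alpha}$. The function name `e_prediction_set` and the toy scores are hypothetical.

```python
def e_prediction_set(cal_scores, test_scores, alpha):
    """Return the candidate labels whose e-value is below 1/alpha.

    cal_scores:  nonnegative scores of the true labels on the calibration set
    test_scores: nonnegative score of each candidate label for the test input
    alpha:       miscoverage level in (0, 1); may be chosen from the data

    Assumption (not from the source page): the e-value of a candidate with
    score s is (n + 1) * s / (sum(cal_scores) + s), and the calibration
    scores have a strictly positive sum.
    """
    n = len(cal_scores)
    cal_sum = sum(cal_scores)
    kept = []
    for label, s in enumerate(test_scores):
        # Score of the candidate relative to the mean of all n + 1 scores.
        e_value = (n + 1) * s / (cal_sum + s)
        if e_value < 1.0 / alpha:
            kept.append(label)
    return kept


# Toy example: four calibration scores, three candidate labels.
cal = [0.1, 0.2, 0.3, 0.4]
test = [0.05, 0.5, 5.0]
print(e_prediction_set(cal, test, 0.9))  # [0]
print(e_prediction_set(cal, test, 0.5))  # [0, 1]
print(e_prediction_set(cal, test, 0.2))  # [0, 1, 2]
```

With these toy numbers the e-values are roughly 0.24, 1.67 and 4.17, so a larger miscoverage level shrinks the set. Under this assumed form no e-value can exceed $n+1$, so any level at or below $1/(n+1)$ returns every label.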

Figures (6)

  • Figure 1: Training curves for $\lambda \in \{5,10,50\}$, averaged over 5 runs and smoothed with a moving average of size 50 for clarity. Shaded regions show $\pm1$ standard deviation across runs. Left: training loss. Center: mean set size. Right: mean adaptive miscoverage $\tilde{\alpha}$.
  • Figure 2: Distribution of adaptive miscoverage $\tilde{\alpha}$ across 100 randomly sampled test points.
  • Figure 3: Distribution of conformal set sizes across 100 test points for the three methods.
  • Figure 4: Evolution of $\lambda$ and mean set size during Algorithm \ref{algorithm_lambda}, showing the bracketing and bisection phases converging to the target $M=2$.
  • Figure 5: Visualization of the synthetic regression dataset.
  • ...and 1 more figure

Theorems & Definitions (11)

  • Definition 2.1: E-variable
  • Proposition 2.2: Gauthier et al. (2025), "E-values expand the scope of conformal prediction"
  • Remark 2.3: Conformal set size in classification
  • Remark 2.4: Conformal set size in regression
  • Definition 2.5: Coverage policy
  • Theorem 2.6: Leave-one-out proxy under constant $\alpha$
  • Proposition 2.7: Monotonicity of leave-one-out size under constant $\alpha$
  • Theorem A.1: Consistency of leave-one-out size
  • proof
  • Proposition B.1: Monotonicity of leave-one-out size under constant $\alpha$
  • ...and 1 more