Backward Conformal Prediction
Etienne Gauthier, Francis Bach, Michael I. Jordan
TL;DR
Backward Conformal Prediction (BCP) addresses the fixed-coverage limitation of standard conformal prediction: instead of fixing the coverage level and letting the set size vary, it enforces a data-dependent size constraint rule $\mathcal{T}$ on prediction sets and adapts the coverage level accordingly, while preserving a marginal coverage guarantee through e-values. The framework pairs conformal e-prediction with an adaptive miscoverage level $\tilde{\alpha}$ and introduces a leave-one-out estimator $\hat{\alpha}^{\rm LOO}$ that estimates $\mathbb{E}[\tilde{\alpha}]$ from calibration data, so that the guarantee can be computed in practice. Theoretical results show $|\hat{\alpha}^{\rm LOO} - \mathbb{E}[\tilde{\alpha}]| = O_P(1/\sqrt{n})$ under mild conditions, and experiments on CIFAR-10 and a medical dataset illustrate effective size control and reliable coverage. The approach offers a flexible, interpretable uncertainty quantification tool for high-stakes domains where small, informative prediction sets are crucial.
Abstract
We introduce $\textit{Backward Conformal Prediction}$, a method that guarantees conformal coverage while providing flexible control over the size of prediction sets. Unlike standard conformal prediction, which fixes the coverage level and allows the conformal set size to vary, our approach defines a rule that constrains how prediction set sizes behave based on the observed data, and adapts the coverage level accordingly. Our method builds on two key foundations: (i) recent results by Gauthier et al. [2025] on post-hoc validity using e-values, which ensure marginal coverage of the form $\mathbb{P}(Y_{\rm test} \in \hat{C}_n^{\tilde{\alpha}}(X_{\rm test})) \ge 1 - \mathbb{E}[\tilde{\alpha}]$ up to a first-order Taylor approximation for any data-dependent miscoverage $\tilde{\alpha}$, and (ii) a novel leave-one-out estimator $\hat{\alpha}^{\rm LOO}$ of the marginal miscoverage $\mathbb{E}[\tilde{\alpha}]$ based on the calibration set, ensuring that the theoretical guarantees remain computable in practice. This approach is particularly useful in applications where large prediction sets are impractical, such as medical diagnosis. We provide theoretical results and empirical evidence supporting the validity of our method, demonstrating that it maintains computable coverage guarantees while ensuring interpretable, well-controlled prediction set sizes.
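To make the backward direction concrete, the following is a minimal sketch of how a size constraint can be turned into an adaptive miscoverage level and a leave-one-out estimate. It rests on several assumptions that the abstract does not state: the e-value for a candidate label is taken in the ratio form $(n+1)\,s / (\sum_i s_i + s)$ over positive nonconformity scores, the prediction set at level $\alpha$ is $\{y : E^y < 1/\alpha\}$, and the size rule is a constant cap `max_size` rather than a general data-dependent rule. All function names are hypothetical and chosen for illustration only.

```python
import numpy as np


def e_values(cal_scores, test_scores):
    """Assumed ratio-form e-values, one per candidate label of a single input.

    cal_scores:  shape (n,), positive scores of calibration pairs at their true labels.
    test_scores: shape (K,), positive scores of the new input for each candidate label.
    """
    n = cal_scores.shape[0]
    return (n + 1) * test_scores / (cal_scores.sum() + test_scores)


def prediction_set(e_vals, alpha):
    """Labels whose e-value is below 1/alpha (all labels when alpha is 0)."""
    if alpha <= 0.0:
        return np.arange(e_vals.shape[0])
    return np.flatnonzero(e_vals < 1.0 / alpha)


def adaptive_alpha(e_vals, max_size):
    """Smallest alpha for which the prediction set has at most max_size labels."""
    num_labels = e_vals.shape[0]
    if max_size >= num_labels:
        return 0.0  # the full label set already satisfies the constraint
    # The set {y : E^y < 1/alpha} has at most max_size elements as soon as
    # 1/alpha does not exceed the (max_size + 1)-th smallest e-value.
    threshold = np.sort(e_vals)[max_size]
    # Cap at 1: in that edge case this sketch may not meet the size constraint.
    return float(min(1.0, 1.0 / threshold))


def loo_alpha(cal_true_scores, cal_all_scores, max_size):
    """Leave-one-out estimate: average adaptive alpha over held-out calibration points.

    cal_true_scores: shape (n,), score of each calibration pair at its true label.
    cal_all_scores:  shape (n, K), scores of each calibration input for every label.
    """
    n = cal_true_scores.shape[0]
    alphas = []
    for j in range(n):
        others = np.delete(cal_true_scores, j)  # calibrate on the other n - 1 points
        e_vals = e_values(others, cal_all_scores[j])
        alphas.append(adaptive_alpha(e_vals, max_size))
    return float(np.mean(alphas))


if __name__ == "__main__":
    # Synthetic stand-in for a classifier: negative log-probabilities as scores.
    rng = np.random.default_rng(0)
    probs = rng.dirichlet(np.ones(10), size=200)
    labels = rng.integers(0, 10, size=200)
    all_scores = -np.log(probs)
    true_scores = all_scores[np.arange(200), labels]

    test_scores = -np.log(rng.dirichlet(np.ones(10)))
    e_vals = e_values(true_scores, test_scores)
    alpha_tilde = adaptive_alpha(e_vals, max_size=3)
    print("adaptive alpha:", alpha_tilde)
    print("prediction set:", prediction_set(e_vals, alpha_tilde))
    print("LOO estimate:", loo_alpha(true_scores, all_scores, max_size=3))
```

In this sketch the leave-one-out average is the quantity a practitioner would report as the estimated miscoverage, so the stated guarantee reads as coverage of roughly one minus that value. The synthetic scores only exercise the code path and say nothing about the behaviour reported in the paper's experiments.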
