Leave-One-Out Stable Conformal Prediction
Kiljae Lee, Yuan Zhang
TL;DR
This work addresses the computational bottleneck of full conformal prediction when many predictions are required. It introduces Leave-One-Out Stable Conformal Prediction (LOO-StabCP), which fits the model once on the training data and uses leave-one-out stability to correct that single fit. The method preserves the finite-sample coverage guarantee $\mathbb{P}(Y_{n+j} \in \mathcal{C}^{\mathrm{LOO}}_{j,\alpha}(X_{n+j})) \ge 1-\alpha$ for each test point $j$ while largely avoiding the need to refit the model for each test point, which makes it faster than RO-StabCP when the number of prediction requests is large. The authors derive LOO stability bounds for Regularized Loss Minimization (RLM) and Stochastic Gradient Descent (SGD), extend the framework to kernel methods, neural networks, and bagging, and validate the approach through simulations and real data, including a conformalized screening application (LOO-cfBH) that improves power under FDR control. Overall, LOO-StabCP enables scalable, distribution-free uncertainty quantification for large-scale prediction and screening tasks.
Abstract
Conformal prediction (CP) is an important tool for distribution-free predictive uncertainty quantification. Yet a major challenge is to balance computational efficiency and prediction accuracy, particularly when multiple predictions are required. We propose Leave-One-Out Stable Conformal Prediction (LOO-StabCP), a novel method that speeds up full conformal prediction using algorithmic stability, without sample splitting. By leveraging leave-one-out stability, our method handles a large number of prediction requests much faster than the existing method RO-StabCP, which is based on replace-one stability. We derive stability bounds for several popular machine learning tools: regularized loss minimization (RLM) and stochastic gradient descent (SGD), as well as kernel methods, neural networks, and bagging. Our method is theoretically justified and demonstrates superior numerical performance on synthetic and real-world data. We apply our method to a screening problem, where its effective use of the training data leads to improved test power compared to a state-of-the-art method based on split conformal prediction.
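To make the idea of correcting a single model fit concrete, the following is a minimal sketch and not the authors' implementation. It assumes the absolute-residual nonconformity score, and it assumes that stability bounds are already available as arrays (`tau_train` and `tau_test` are hypothetical names, and the constant 0.05 used in the demo is a made-up placeholder, not a bound derived in the paper). The model is fitted once, each training score is inflated by its stability bound, and the interval half-width is the conformal quantile plus the bound at the test point.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """(1 - alpha) quantile of the scores augmented with one point at +infinity."""
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))  # rank among n + 1 values
    if k > n:
        return np.inf  # too few training points for this alpha
    return np.sort(scores)[k - 1]

def stable_intervals(pred_train, y_train, pred_test, tau_train, tau_test, alpha=0.1):
    """Intervals from ONE fitted model plus assumed stability corrections.

    pred_train : (n,) predictions of the fitted model on the training inputs
    y_train    : (n,) training responses
    pred_test  : (m,) predictions of the same model on the test inputs
    tau_train  : (m, n) assumed stability bounds for training score i, test point j
    tau_test   : (m,) assumed stability bounds for the score at test point j
    """
    resid = np.abs(y_train - pred_train)  # absolute-residual scores
    m = len(pred_test)
    lower, upper = np.empty(m), np.empty(m)
    for j in range(m):
        q = conformal_quantile(resid + tau_train[j], alpha)  # inflated scores
        half_width = q + tau_test[j]
        lower[j], upper[j] = pred_test[j] - half_width, pred_test[j] + half_width
    return lower, upper

# Demo on synthetic data with a ridge regression fitted once (closed form).
rng = np.random.default_rng(0)
n, m, d = 200, 5, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=n)
X_new = rng.normal(size=(m, d))
lam = 1.0  # ridge penalty (illustrative value)
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

tau_train = np.full((m, n), 0.05)  # placeholder bounds, NOT derived values
tau_test = np.full(m, 0.05)        # placeholder bounds, NOT derived values
lower, upper = stable_intervals(X @ beta_hat, y, X_new @ beta_hat,
                                tau_train, tau_test, alpha=0.1)
```

The loop over test points only re-computes a quantile, never a model fit, which is where the savings over refitting-based approaches would come from. In practice the placeholder bounds would be replaced by bounds derived for the specific learning algorithm.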
