Volume Optimality in Conformal Prediction with Structured Prediction Sets
Chao Gao, Liren Shan, Vaidehi Srinivas, Aravindan Vijayaraghavan
TL;DR
The paper tackles volume optimality in distribution-free conformal prediction by first proving an impossibility result for unrestricted volume minimization, then introducing restricted volume optimality within $\mathcal{C}_k$, the class of unions of $k$ intervals. It develops a dynamic-programming-based conformity score (CP-DP) and shows how to conformalize it to obtain finite-sample coverage guarantees while achieving near-optimal volume within $\mathcal{C}_k$. In the supervised setting, it combines distributional conformal prediction (DCP) with DP to obtain marginal coverage and, when a conditional CDF estimator is available, approximate conditional coverage and conditional restricted volume optimality for $k$-interval prediction sets. Empirical results on synthetic data show that CP-DP produces substantially smaller prediction sets than existing KDE- or density-based methods, particularly in multimodal scenarios, and the DP algorithms run in polynomial time.
Abstract
Conformal prediction is a widely studied technique for constructing prediction sets for future observations. Most conformal prediction methods focus on achieving the necessary coverage guarantees, but do not provide formal guarantees on the size (volume) of the prediction sets. We first prove an impossibility result for volume optimality: any distribution-free method can only find a trivial solution. We then introduce a new notion of volume optimality by restricting the prediction sets to belong to a set family of finite VC-dimension, specifically unions of $k$ intervals. Our main contribution is an efficient distribution-free algorithm based on dynamic programming (DP) that finds a union of $k$ intervals guaranteed, for any distribution, to have near-optimal volume among all unions of $k$ intervals satisfying the desired coverage property. By adopting the framework of distributional conformal prediction (Chernozhukov et al., 2021), the new DP-based conformity score can also be applied to achieve approximate conditional coverage and conditional restricted volume optimality, as long as a reasonable estimator of the conditional CDF is available. The theoretical volume-optimality guarantees are complemented by experiments demonstrating that our method can significantly outperform existing methods in many settings.
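
To make the restricted volume-minimization problem concrete, the following is a minimal sketch of a dynamic program that, given one-dimensional samples, finds at most $k$ intervals with endpoints at sample points that cover a target fraction of the samples with the smallest total length. It is an illustration under stated assumptions, not the authors' CP-DP procedure: the function name `min_volume_k_intervals`, the coverage count `ceil(coverage * n)`, and the cubic-time recursion are all choices made here, and the conformalization step that yields finite-sample coverage guarantees is omitted.

```python
import math


def min_volume_k_intervals(samples, k, coverage):
    """Illustrative DP (hypothetical helper, not the paper's algorithm).

    Returns (total_length, intervals), where intervals is a tuple of
    (lo, hi) pairs: at most k closed intervals with endpoints at sample
    points that together contain at least ceil(coverage * n) samples,
    with minimal total length.
    """
    xs = sorted(samples)
    n = len(xs)
    m = math.ceil(coverage * n)  # assumed target: number of samples to cover
    if m <= 0:
        return 0.0, ()
    if m > n:
        raise ValueError("coverage must be at most 1")

    inf = (math.inf, ())
    # best[i][j][c]: cheapest way to cover exactly c of the first i sorted
    # points using at most j intervals, stored as (length, intervals).
    best = [[[inf] * (m + 1) for _ in range(k + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(k + 1):
            best[i][j][0] = (0.0, ())  # covering nothing costs nothing

    for i in range(1, n + 1):
        for j in range(1, k + 1):
            for c in range(1, min(i, m) + 1):
                # Option 1: the i-th point (index i - 1) is left uncovered.
                cand = best[i - 1][j][c]
                # Option 2: the last interval covers points a .. i - 1.
                for a in range(max(0, i - c), i):
                    prev_len, prev_iv = best[a][j - 1][c - (i - a)]
                    length = prev_len + (xs[i - 1] - xs[a])
                    if length < cand[0]:
                        cand = (length, prev_iv + ((xs[a], xs[i - 1]),))
                best[i][j][c] = cand

    return best[n][k][m]


# Bimodal toy data: two well-separated clusters of four points each.
toy = [0.0, 1.0, 2.0, 3.0, 50.0, 51.0, 52.0, 53.0]
one_interval = min_volume_k_intervals(toy, k=1, coverage=0.75)
two_intervals = min_volume_k_intervals(toy, k=2, coverage=0.75)
```

On this toy data, covering six of the eight points with a single interval forces the interval to bridge the gap between the clusters (total length 51), whereas two intervals can cover six points with total length 4. This is the kind of multimodal situation in which a union of $k$ intervals can be much smaller than a single interval.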
