On Volume Minimization in Conformal Regression
Batiste Le Bars, Pierre Humbert
TL;DR
This work analyzes volume minimization in split conformal regression, framing the problem as finding a minimum-volume set under a coverage constraint. It shows that the calibration step of the split method yields an empirical solution to this objective and derives a finite-sample bound on the excess interval length. To improve volume efficiency further, the paper introduces EffOrt, which learns the base predictor by minimizing the empirical $(1-\alpha)$-quantile of the absolute error (QAE) and then applies a calibration step. An extension, Ad-EffOrt, produces intervals whose size adapts to the covariate. Experiments on synthetic and real data indicate that EffOrt and Ad-EffOrt produce valid prediction sets that are shorter and less variable in length than those of standard split CP and related adaptive methods. Overall, the approach highlights the role of the learning step in obtaining efficient, distribution-free prediction sets for regression tasks.
Abstract
We study the question of volume optimality in split conformal regression, a topic still poorly understood in comparison to coverage control. Using the fact that the calibration step can be seen as an empirical volume minimization problem, we first derive a finite-sample upper bound on the excess volume loss of the interval returned by the classical split method. This quantity measures the difference in length between the interval obtained with the split method and the shortest oracle prediction interval. Then, we introduce EffOrt, a methodology that modifies the learning step so that the base prediction function is selected to minimize the length of the returned intervals. In particular, our theoretical analysis of the excess volume loss of the prediction sets produced by EffOrt reveals the links between the learning and calibration steps, and notably the impact of the choice of the function class of the base predictor. We also introduce Ad-EffOrt, an extension of the previous method, which produces intervals whose size adapts to the value of the covariate. Finally, we evaluate the empirical performance and the robustness of our methodologies.
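To make the calibration step concrete, here is a minimal Python sketch of classical split conformal regression with absolute-residual scores. It is not the authors' code and does not implement EffOrt or Ad-EffOrt. The function names, the one-dimensional covariate, and the least-squares base predictor are assumptions chosen for illustration. The half-width `t` is the smallest value that covers the required number of calibration residuals, which is the sense in which calibration can be read as an empirical volume minimization.

```python
import math
import numpy as np


def split_conformal_halfwidth(residuals, alpha):
    """Smallest half-width t covering ceil((n + 1)(1 - alpha)) calibration residuals."""
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the required order statistic
    if k > n:
        # Too few calibration points for this alpha: only the trivial interval is valid.
        return float("inf")
    return float(np.sort(residuals)[k - 1])


def split_conformal_interval(x_train, y_train, x_cal, y_cal, x_new, alpha=0.1):
    """Return (lower, upper, t) for a 1-D covariate.

    Learning step: ordinary least squares, an illustrative assumption;
    any base predictor fitted on the training split could be used instead.
    """
    design = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)

    def predict(x):
        return coef[0] + coef[1] * np.asarray(x)

    # Calibration step: absolute residuals on data not used for fitting.
    residuals = np.abs(y_cal - predict(x_cal))
    t = split_conformal_halfwidth(residuals, alpha)

    center = predict(x_new)
    return center - t, center + t, t
```

Every returned interval has the same length `2 * t`, regardless of `x_new`. That length depends on the base predictor only through the calibration residuals, so a predictor whose absolute errors have a smaller $(1-\alpha)$-quantile gives shorter intervals.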
