Conditional validity of heteroskedastic conformal regression
Nicolas Dewolf, Bernard De Baets, Willem Waegeman
TL;DR
The paper addresses conditional validity in conformal regression under heteroskedastic noise by analyzing and comparing inductive (split) conformal prediction, normalized conformal prediction (NCP), and Mondrian conformal predictors (MCP). It develops theoretical links between conditional validity and pivotal quantities, showing that normalization can yield conditional guarantees under location-scale families, and that MCP delivers class-wise conditional validity when data are partitioned by a taxonomy based on uncertainty. Through synthetic and real-data experiments, the authors demonstrate that MCP provides more stable conditional coverage near the target level, while marginal methods may under- or over-cover in regions of higher uncertainty. The work advances practical uncertainty quantification by enabling adaptive, conditionally valid prediction sets in heteroskedastic regression and offers diagnostic tools to assess conditional performance.
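The link between normalization and pivotal quantities can be made concrete with a short derivation. This is a sketch under idealized assumptions rather than the paper's exact statement: the location function \mu, the scale function \sigma, and the noise variable \varepsilon are illustrative symbols, the estimates of \mu and \sigma are assumed exact, and \varepsilon is assumed independent of X.

```latex
\begin{align*}
  % Location-scale model with noise independent of the features
  Y &= \mu(X) + \sigma(X)\,\varepsilon, \qquad \varepsilon \text{ independent of } X, \\
  % The normalized residual no longer depends on X: it is a pivotal quantity
  A(X, Y) &= \frac{|Y - \mu(X)|}{\sigma(X)} = |\varepsilon|, \\
  % Hence the conditional and marginal distributions of the score coincide
  P\bigl(A(X, Y) \le q \mid X = x\bigr) &= P\bigl(|\varepsilon| \le q\bigr)
    \quad \text{for every } x, \\
  % so a threshold q that is marginally valid is also conditionally valid
  P\bigl(Y \in [\mu(x) - q\,\sigma(x),\; \mu(x) + q\,\sigma(x)] \mid X = x\bigr)
    &\ge 1 - \alpha
    \quad \text{whenever } P\bigl(|\varepsilon| \le q\bigr) \ge 1 - \alpha .
\end{align*}
```

When \mu and \sigma are only estimated, or the noise is not of location-scale form, the normalized score is no longer exactly pivotal and this argument holds only approximately.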
Abstract
Conformal prediction, and split conformal prediction as a specific implementation, offer a distribution-free approach to estimating prediction intervals with statistical guarantees. Recent work has shown that split conformal prediction can produce state-of-the-art prediction intervals when focusing on marginal coverage, i.e., on a calibration dataset the method produces on average prediction intervals that contain the ground truth with a predefined coverage level. However, such intervals are often not adaptive, which can be problematic for regression problems with heteroskedastic noise. This paper sheds new light on how prediction intervals can be constructed with methods such as normalized and Mondrian conformal prediction so that they adapt to the heteroskedasticity of the underlying process. Theoretical and experimental results compare these methods systematically. In particular, it is shown how the conditional validity of a chosen conformal predictor can be related to (implicit) assumptions about the data-generating distribution.
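The three families of methods named above can be contrasted in a minimal Python sketch. It is not the authors' implementation: the function names, the synthetic data-generating process, the three equal-frequency uncertainty bins, and the use of the true mean and scale as stand-ins for fitted models are all assumptions made for illustration.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]

def split_cp(res_cal, alpha):
    """Split conformal prediction: one constant half-width for all inputs."""
    return conformal_quantile(np.abs(res_cal), alpha)

def normalized_cp(res_cal, sigma_cal, sigma_test, alpha):
    """Normalized CP: half-width scales with an uncertainty estimate."""
    q = conformal_quantile(np.abs(res_cal) / sigma_cal, alpha)
    return q * sigma_test

def mondrian_cp(res_cal, bins_cal, bins_test, alpha):
    """Mondrian CP: a separate quantile per taxonomy class (bin)."""
    half = np.full(len(bins_test), np.inf)  # unseen bins stay conservative
    for b in np.unique(bins_cal):
        q = conformal_quantile(np.abs(res_cal[bins_cal == b]), alpha)
        half[bins_test == b] = q
    return half

# Hypothetical heteroskedastic process: y = 2x + (0.1 + x) * eps.
rng = np.random.default_rng(0)

def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    sigma = 0.1 + x
    y = 2.0 * x + sigma * rng.standard_normal(n)
    return x, y, sigma

alpha = 0.1
x_cal, y_cal, s_cal = sample(5000)
x_te, y_te, s_te = sample(20000)

# Assumption: the true mean 2x and scale 0.1 + x stand in for fitted models.
res_cal = y_cal - 2.0 * x_cal
res_te = y_te - 2.0 * x_te

# Taxonomy: three equal-frequency bins of the uncertainty estimate.
edges = np.quantile(s_cal, [1 / 3, 2 / 3])
bins_cal = np.digitize(s_cal, edges)
bins_te = np.digitize(s_te, edges)

half_split = np.full(len(x_te), split_cp(res_cal, alpha))
half_norm = normalized_cp(res_cal, s_cal, s_te, alpha)
half_mond = mondrian_cp(res_cal, bins_cal, bins_te, alpha)

def coverage_by_bin(half):
    """Empirical coverage of [mean - half, mean + half] in each bin."""
    covered = np.abs(res_te) <= half
    return np.array([covered[bins_te == b].mean() for b in range(3)])

cov_split = coverage_by_bin(half_split)
cov_norm = coverage_by_bin(half_norm)
cov_mond = coverage_by_bin(half_mond)
print("split:     ", cov_split.round(3))
print("normalized:", cov_norm.round(3))
print("Mondrian:  ", cov_mond.round(3))
```

In this idealized setting the constant-width intervals should over-cover in the low-uncertainty bin and under-cover in the high-uncertainty bin, while the normalized and Mondrian variants should stay close to the 90% target in every bin. With a poorly estimated scale the normalized variant loses this property, whereas the Mondrian variant keeps its per-bin guarantee at the cost of fewer calibration points per bin.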
