Volume Optimality in Conformal Prediction with Structured Prediction Sets

Chao Gao, Liren Shan, Vaidehi Srinivas, Aravindan Vijayaraghavan

TL;DR

The paper tackles volume optimality in distribution-free conformal prediction by first proving an impossibility result for unrestricted volume minimization, then introducing restricted volume optimality within the class of unions of $k$ intervals, $\mathcal{C}_k$. It develops a dynamic-programming-based conformity score (CP-DP) and shows how to conformalize it to obtain finite-sample coverage guarantees while achieving near-optimal volume within $\mathcal{C}_k$, including approximate conditional coverage when a conditional CDF estimator is available. In the supervised setting, it combines distributional conformal prediction (DCP) with DP to obtain marginal and (when feasible) approximate conditional coverage and conditional restricted volume optimality for $k$-interval prediction sets. Empirical results on synthetic data show that CP-DP substantially reduces prediction-set volume relative to existing KDE- or density-based methods, particularly in multimodal scenarios, and the DP algorithms run in polynomial time, which supports scalable deployment.
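
As a point of reference for the "conformalize" step, the generic split-conformal calibration rule can be sketched as follows. This is the standard textbook quantile rule, not the paper's CP-DP score; the function name `conformal_quantile` and its arguments are assumptions made for illustration.

```python
import math

def conformal_quantile(scores, alpha):
    """Generic split-conformal threshold (illustrative, not the paper's CP-DP score).

    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score.
    Under exchangeability, a fresh score is <= this threshold with
    probability at least 1 - alpha. If the rank exceeds n, only the
    trivial threshold +infinity gives the guarantee.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]
```

The `(n + 1)` factor, rather than `n`, is what yields a finite-sample guarantee instead of an asymptotic one.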

Abstract

Conformal Prediction is a widely studied technique to construct prediction sets of future observations. Most conformal prediction methods focus on achieving the necessary coverage guarantees, but do not provide formal guarantees on the size (volume) of the prediction sets. We first prove an impossibility of volume optimality where any distribution-free method can only find a trivial solution. We then introduce a new notion of volume optimality by restricting the prediction sets to belong to a set family (of finite VC-dimension), specifically a union of $k$-intervals. Our main contribution is an efficient distribution-free algorithm based on dynamic programming (DP) to find a union of $k$-intervals that is guaranteed for any distribution to have near-optimal volume among all unions of $k$-intervals satisfying the desired coverage property. By adopting the framework of distributional conformal prediction (Chernozhukov et al., 2021), the new DP based conformity score can also be applied to achieve approximate conditional coverage and conditional restricted volume optimality, as long as a reasonable estimator of the conditional CDF is available. While the theoretical results already establish volume-optimality guarantees, they are complemented by experiments that demonstrate that our method can significantly outperform existing methods in many settings.
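
To make the dynamic-programming idea concrete, here is a minimal sketch of one way to find at most $k$ intervals of minimum total length that cover at least `m` of the sample points. It is a simplified illustration under stated assumptions (interval endpoints restricted to data points, degenerate zero-length intervals allowed, no conformalization step) and is not the paper's exact algorithm; the name `min_volume_k_intervals` is hypothetical.

```python
import math

def min_volume_k_intervals(points, k, m):
    """Return (volume, intervals) for at most k closed intervals, with
    endpoints at data points, covering >= m points at minimum total length."""
    ys = sorted(points)
    n = len(ys)
    if k < 1 or not 1 <= m <= n:
        raise ValueError("need k >= 1 and 1 <= m <= len(points)")
    empty = (math.inf, ())
    # State value: (total length, tuple of (start_index, end_index)).
    # closed[t][c]: t intervals used, c points covered (capped at m),
    #               latest point not covered.
    # opened[t][c]: same, but the latest point ends the last interval.
    closed = [[empty] * (m + 1) for _ in range(k + 1)]
    opened = [[empty] * (m + 1) for _ in range(k + 1)]
    closed[0][0] = (0.0, ())
    for i in range(n):
        new_closed = [[empty] * (m + 1) for _ in range(k + 1)]
        new_opened = [[empty] * (m + 1) for _ in range(k + 1)]
        for t in range(k + 1):
            for c in range(m + 1):
                best = min(closed[t][c], opened[t][c], key=lambda s: s[0])
                if best[0] == math.inf:
                    continue
                # Option 1: leave point i uncovered.
                if best[0] < new_closed[t][c][0]:
                    new_closed[t][c] = best
                c2 = min(c + 1, m)
                # Option 2: start a new interval at point i (zero added length).
                if t < k and best[0] < new_opened[t + 1][c2][0]:
                    new_opened[t + 1][c2] = (best[0], best[1] + ((i, i),))
                # Option 3: extend the interval ending at point i - 1 to point i.
                cost, ivs = opened[t][c]
                if cost < math.inf:
                    ext = cost + (ys[i] - ys[i - 1])
                    if ext < new_opened[t][c2][0]:
                        new_opened[t][c2] = (ext, ivs[:-1] + ((ivs[-1][0], i),))
        closed, opened = new_closed, new_opened
    finals = [tab[t][m] for tab in (closed, opened) for t in range(k + 1)]
    volume, ivs = min(finals, key=lambda s: s[0])
    return volume, [(ys[a], ys[b]) for a, b in ivs]

# Two well-separated clusters: two intervals avoid paying for the gap.
print(min_volume_k_intervals([0, 1, 2, 10, 11, 12], k=2, m=6))
# (4.0, [(0, 2), (10, 12)])
```

With `k=1` the same call must bridge the gap and returns a volume of 12.0, which illustrates why allowing several intervals helps on multimodal data. The sketch runs in $O(n k m)$ state updates and stores the chosen intervals in each state for simplicity rather than speed.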

Paper Structure

This paper contains 45 sections, 8 theorems, 66 equations, 21 figures, 2 tables, 1 algorithm.

Key Result

Theorem 2.1

Consider observations $Y_1, Y_2, \dots, Y_n, Y_{n+1}$ sampled i.i.d. from a distribution $P$ on $\mathbb{R}$. Suppose $\widehat{C}=\widehat{C}(Y_1,\dots,Y_n)$ satisfies $\mathbb{P}(Y_{n+1}\in \widehat{C})\geq 1-\alpha$ for all distributions $P$. Then, for any $\varepsilon \in (0,\alpha)$,

Figures (21)

  • Figure 1: Conformal prediction sets on the mixture of Gaussians data from $P = \frac{1}{3}N(-6,0.0001)+\frac{1}{3}N(0,1)+\frac{1}{3}N(8,0.25)$. The coverage probability is $80\%$. The theoretically optimal volume is $3.0178$.
  • Figure 2: Results in the supervised setting on synthetic data from Romano et al. (2019) for target coverage 0.7. The left plot shows the output of DCP-QR*, the state-of-the-art method of Chernozhukov et al. (2021), which outputs prediction sets with average volume 1.29. The right plot shows the output of our method with $k = 5$ intervals, which achieves a significantly improved average volume of 0.45.
  • Figure 3: Results in the supervised setting on synthetic data with a $20$-dimensional feature vector from Izbicki et al. (2020) for target coverage 0.7. The left plot shows the output of the HPD-Split method of Izbicki et al. (2022), with average volume $3.60$. The right plot shows the output of our method with $k=2$ intervals, which has an average volume of $3.55$.
  • Figure 4: Conformal prediction sets on the Gaussian dataset. The left plot shows the histogram of the dataset and the prediction set produced by conformalized DP with $k=1$; the right plot shows the kernel density estimation with bandwidth $\rho=0.001$ and the prediction set given by the conformalized KDE.
  • Figure 5: Volumes of prediction sets of the two methods on the Gaussian dataset (blue) and the benchmark $\operatorname{OPT}_1(N(0,1),0.3)=0.7706$ (red). The blue curves are computed by averaging $100$ independent experiments.
  • ...and 16 more figures
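
The benchmark value quoted in the Figure 5 caption can be checked numerically. The sketch below assumes that the second argument of $\operatorname{OPT}_1(N(0,1),0.3)$ is the probability mass the interval must carry; that reading is an assumption, made because it reproduces the quoted number. For a symmetric unimodal density, the shortest interval with a given mass is centered at the mode, so its length is $2\,\Phi^{-1}(0.5 + \text{mass}/2)$. The helper name `shortest_normal_interval` is hypothetical.

```python
from statistics import NormalDist

def shortest_normal_interval(mass, sigma=1.0):
    """Length of the shortest interval carrying `mass` under N(mu, sigma^2).

    The interval is centered at the mean, so the length does not depend on mu.
    """
    if not 0.0 < mass < 1.0:
        raise ValueError("mass must lie in (0, 1)")
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    return 2.0 * sigma * z

print(round(shortest_normal_interval(0.3), 4))  # 0.7706
```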

Theorems & Definitions (19)

  • Theorem 2.1
  • Remark 2.2
  • Proposition 2.3
  • Theorem 2.5
  • Theorem 3.3
  • Remark B.1
  • Theorem C.2
  • Lemma C.3
  • Proof of Theorem \ref{thm:DPvsKDE}
  • Lemma D.1
  • ...and 9 more