On the Expected Size of Conformal Prediction Sets

Guneet S. Dhillon, George Deligiannidis, Tom Rainforth

TL;DR

This paper tackles the finite-sample estimation of the expected size of split conformal prediction sets, a key practical metric alongside error control. It derives a theoretical quantification linking the expected set size to calibration-score distributions via $\mathbb{E}[|\hat{C}^{R}_{\alpha}(X_{n+1};Z_{1:n})|] = \int_{\mathcal{R}} \mathbb{P}\{\tau_{\alpha}(R_{1:n})\ge r\}\#_{R}(r)\,dr$, and introduces practical point and interval estimators that require only a single data collection. The methods handle both known and unknown multiplicative factors $\#_{R}$, using empirical calibration-score distributions and nested Monte Carlo to produce reliable estimates and high-probability bounds. Experiments on real-world regression and classification tasks demonstrate that these estimates closely track Monte Carlo baselines and provide informative intervals, enabling practitioners to assess expected set sizes without extensive data reuse or repeated conformal runs.
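To make the integral above concrete, here is a minimal Python sketch under stated assumptions; it is not the authors' estimator. It assumes the absolute-residual score $|y - f(x)|$ for regression, so the prediction set is an interval of length $2\tau_{\alpha}$, and it assumes that $\#_{R}(r)$ counts the labels attaining score $r$, so that $\#_{R}(r) = 2$ for $r > 0$. It also assumes the threshold is the $\lceil (1-\alpha)(n+1) \rceil$-th smallest calibration score and that the scores are i.i.d. The helper names (`binom_cdf`, `threshold_rank`, `plug_in_expected_size`, `monte_carlo_expected_size`) and the half-normal score distribution are hypothetical choices made for illustration. The plug-in estimate replaces the unknown score distribution with the empirical distribution of a single calibration sample and is compared against a Monte Carlo average over many fresh samples.

```python
import math
import random

def binom_cdf(k, n, p):
    """P{Binomial(n, p) <= k}, summed term by term (fine for small n)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def threshold_rank(n, alpha):
    """Rank k of the calibration score used as the threshold tau_alpha."""
    return math.ceil((1 - alpha) * (n + 1))

def plug_in_expected_size(scores, alpha):
    """Plug-in estimate of the integral of P{tau >= r} * 2 over r >= 0.

    Assumes #_R(r) = 2 (absolute-residual score) and replaces the unknown
    score CDF with the empirical CDF of the single sample `scores`.
    """
    n = len(scores)
    k = threshold_rank(n, alpha)
    if k > n:
        return math.inf  # threshold is infinite, so the set is unbounded
    ordered = [0.0] + sorted(scores)
    total = 0.0
    for i in range(n):
        # For r in (ordered[i], ordered[i + 1]], exactly i scores lie strictly
        # below r, so under the empirical CDF
        # P{tau >= r} = P{Binomial(n, i / n) <= k - 1}.
        width = ordered[i + 1] - ordered[i]
        total += 2.0 * width * binom_cdf(k - 1, n, i / n)
    return total

def monte_carlo_expected_size(draw_scores, n, alpha, repeats):
    """Baseline: average interval length 2 * tau over fresh calibration samples.

    Assumes the rank k does not exceed n.
    """
    k = threshold_rank(n, alpha)
    sizes = []
    for _ in range(repeats):
        tau = sorted(draw_scores(n))[k - 1]
        sizes.append(2.0 * tau)
    return sum(sizes) / repeats

# Hypothetical setup: residuals are standard normal, so scores are half-normal.
rng = random.Random(0)
draw = lambda n: [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]

single_sample = draw(200)
estimate = plug_in_expected_size(single_sample, alpha=0.1)
baseline = monte_carlo_expected_size(draw, n=200, alpha=0.1, repeats=2000)
print(f"plug-in: {estimate:.3f}  Monte Carlo: {baseline:.3f}")
```

The plug-in value uses one calibration sample, whereas the baseline needs thousands of fresh samples, which is the practical contrast the summary draws. For scores whose multiplicative factor is not constant, the factor `2.0` would have to be replaced by an estimate of $\#_{R}(r)$.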

Abstract

While conformal predictors reap the benefits of rigorous statistical guarantees on their error frequency, the size of their corresponding prediction sets is critical to their practical utility. Unfortunately, there is currently a lack of finite-sample analysis and guarantees for their prediction set sizes. To address this shortfall, we theoretically quantify the expected size of the prediction sets under the split conformal prediction framework. As this precise formulation cannot usually be calculated directly, we further derive point estimates and high-probability interval bounds that can be empirically computed, providing a practical method for characterizing the expected set size. We corroborate the efficacy of our results with experiments on real-world datasets for both regression and classification problems.

Paper Structure

This paper contains 40 sections, 4 theorems, 38 equations, 3 figures, and 9 tables.

Key Result

Theorem 1

If the test and calibration non-conformity scores are independent of each other, then the expected size of the split conformal prediction sets is given by Eq. (size-quantification-non-iid). Furthermore, if the calibration non-conformity scores are i.i.d., then the expected size is given by eq
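As a hedged illustration of how the i.i.d. case can be made explicit, the following is a standard order-statistics identity and not necessarily the form stated in the paper. Assume that $\tau_{\alpha}(R_{1:n})$ is the $k$-th smallest calibration score with $k = \lceil (1-\alpha)(n+1) \rceil \le n$, and write $F(r^{-}) = \mathbb{P}\{R_{1} < r\}$ (both $k$ and $F$ are notation introduced here). The event $\tau_{\alpha}(R_{1:n}) \ge r$ occurs exactly when at most $k-1$ calibration scores fall strictly below $r$, and for i.i.d. scores that count is $\mathrm{Binomial}(n, F(r^{-}))$:

$$
\mathbb{P}\{\tau_{\alpha}(R_{1:n}) \ge r\} = \sum_{j=0}^{k-1} \binom{n}{j}\, F(r^{-})^{j}\, \bigl(1 - F(r^{-})\bigr)^{n-j}.
$$

Substituting this into the integral from the TL;DR gives

$$
\mathbb{E}\bigl[|\hat{C}^{R}_{\alpha}(X_{n+1};Z_{1:n})|\bigr] = \int_{\mathcal{R}} \sum_{j=0}^{k-1} \binom{n}{j}\, F(r^{-})^{j}\, \bigl(1 - F(r^{-})\bigr)^{n-j}\, \#_{R}(r)\,dr,
$$

which depends on the calibration scores only through their common distribution $F$.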

Figures (3)

  • Figure 1: Expected prediction set sizes conditioned on the test datum feature. We illustrate the expected sizes of split conformal prediction sets conditioned on varying test datum features using different UCI datasets. We use CQR (Romano et al., 2019) for regression in the top row and APS (Romano et al., 2020) for classification in the bottom row. The estimates are obtained via Monte Carlo averaging and our point estimates (refer to the legend for the color scheme); they are depicted as a histogram with side-by-side bars for comparison.
  • Figure 2: Expected prediction set sizes conditioned on the test datum feature. We illustrate the expected sizes of split conformal prediction sets conditioned on varying test datum features using different non-conformity scores (rows) and UCI datasets (columns). The estimates are obtained via Monte Carlo averaging and our point estimates (refer to the legend for the color scheme); they are depicted as a histogram with side-by-side bars.
  • Figure 3: Marginal expected prediction set sizes (synthetic example). We plot the theoretical expected prediction set sizes (cf. the i.i.d. expression in Theorem 1) on the x-axis against their empirical estimates on the y-axis. These estimates include: (i) the Monte Carlo average (solid black line), (ii) our point estimate from the subsection on size estimation with a known multiplicative factor (green line), and (iii) our upper and lower confidence bounds from Corollary 3 (orange/blue lines, changing with $\gamma$ as per the legend). $\alpha$ is set to 0.1, and the results are averaged into a line plot over different $a$ and $b$ values (bands denote the standard deviations). The dashed black line is the identity line. The number of calibration data points $n$ increases from left to right, and the size of the space of non-conformity scores $m$ increases from top to bottom.

Theorems & Definitions (8)

  • Theorem 1: Expected size of prediction sets
  • Corollary 2: Expected size of prediction sets conditioned on the test datum feature
  • Corollary 3: Confidence interval for the expected prediction set size
  • proof
  • proof
  • proof
  • Corollary 4: Expected size of prediction sets conditioned on the calibration data
  • proof