Universal distribution of the empirical coverage in split conformal prediction

Paulo C. Marques F

TL;DR

This work quantifies the empirical coverage of split conformal prediction under exchangeability. It derives two exact distributions. For a finite batch of future observations, the number of covered observations follows a Beta-Binomial distribution with parameters $b=\lceil(1-\alpha)(n+1)\rceil$ and $g=\lfloor\alpha(n+1)\rfloor$; as the batch size goes to infinity, the empirical coverage converges almost surely to a limit distributed as $\text{Beta}(b,g)$. Both distributions are universal: they depend only on the nominal miscoverage level $\alpha$ and the calibration sample size $n$. This yields a principled criterion for choosing the calibration size: practitioners can select the minimum $n$ that keeps the deviation of the limiting empirical coverage from $1-\alpha$ within a prescribed tolerance $\epsilon$ with prescribed probability $\tau$, as summarized in Table 1. Overall, the work connects exchangeability, de Finetti’s representation, and conformal prediction to provide distribution-free, actionable guarantees for prediction-set validity.
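To make the finite-batch statement concrete, the following is a minimal sketch that evaluates the Beta-Binomial probabilities from $b$ and $g$ using only the Python standard library. The batch-size symbol `m` and the function names `beta_params` and `coverage_pmf` are illustrative assumptions, not notation taken from this summary. The sketch also assumes $g \geq 1$, so that the Beta distribution is proper.

```python
from fractions import Fraction
from math import ceil, exp, lgamma


def beta_params(alpha, n):
    """Return (b, g) with b = ceil((1 - alpha)(n + 1)) and g = floor(alpha (n + 1))."""
    a = Fraction(str(alpha))  # exact decimal arithmetic avoids float rounding inside ceil
    b = ceil((1 - a) * (n + 1))
    g = n + 1 - b  # identical to floor(alpha * (n + 1))
    if g < 1:
        # Assumption of this sketch: n is large enough that alpha * (n + 1) >= 1.
        raise ValueError("calibration size too small for this alpha")
    return b, g


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def coverage_pmf(k, m, alpha, n):
    """P(exactly k of m future observations are covered), Beta-Binomial(m, b, g)."""
    b, g = beta_params(alpha, n)
    log_choose = lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
    return exp(log_choose + log_beta(k + b, m - k + g) - log_beta(b, g))


if __name__ == "__main__":
    # Hypothetical setting: alpha = 0.1, n = 100 calibration points, batch of m = 50.
    m = 50
    pmf = [coverage_pmf(k, m, 0.1, 100) for k in range(m + 1)]
    mean_coverage = sum(k * p for k, p in enumerate(pmf)) / m
    print(f"expected empirical coverage: {mean_coverage:.4f}")
```

The expected empirical coverage under this distribution is $b/(b+g) = b/(n+1)$, which is never below $1-\alpha$ because of the ceiling in $b$.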

Abstract

When split conformal prediction operates in batch mode with exchangeable data, we determine the exact distribution of the empirical coverage of prediction sets produced for a finite batch of future observables, as well as the exact distribution of its almost sure limit when the batch size goes to infinity. Both distributions are universal, being determined solely by the nominal miscoverage level and the calibration sample size, thereby establishing a criterion for choosing the minimum required calibration sample size in applications.
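The calibration-size criterion can be sketched as a simple search. The code below assumes one plausible reading of the criterion, namely finding the smallest $n$ such that the limiting coverage, distributed as $\text{Beta}(b,g)$, lies in $[1-\alpha-\epsilon,\, 1-\alpha+\epsilon]$ with probability at least $\tau$; the exact inequality used in the paper may differ. The function names are hypothetical, and the Beta CDF is computed through the standard identity $P(\text{Beta}(b,g) \leq x) = P(\text{Binomial}(b+g-1, x) \geq b)$, which holds for integer parameters.

```python
from fractions import Fraction
from math import ceil, exp, lgamma, log, log1p


def beta_params(alpha, n):
    """Return (b, g) with b = ceil((1 - alpha)(n + 1)) and g = n + 1 - b."""
    a = Fraction(str(alpha))  # exact arithmetic so ceil is not hit by float rounding
    b = ceil((1 - a) * (n + 1))
    return b, n + 1 - b


def beta_cdf_int(x, b, g):
    """CDF of Beta(b, g) at x for integers b, g >= 1, via a binomial tail sum."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    n = b + g - 1
    total = 0.0
    for j in range(b, n + 1):
        log_term = (lgamma(n + 1) - lgamma(j + 1) - lgamma(n - j + 1)
                    + j * log(x) + (n - j) * log1p(-x))
        total += exp(log_term)
    return min(total, 1.0)


def limit_coverage_prob(n, alpha, eps):
    """P(|C - (1 - alpha)| <= eps) for C ~ Beta(b, g); requires g >= 1."""
    b, g = beta_params(alpha, n)
    if g < 1:
        raise ValueError("calibration size too small for this alpha")
    target = 1 - Fraction(str(alpha))
    e = Fraction(str(eps))
    return beta_cdf_int(float(target + e), b, g) - beta_cdf_int(float(target - e), b, g)


def min_calibration_size(alpha, eps, tau, n_max=100_000):
    """Smallest n <= n_max meeting the assumed criterion, or None if none is found."""
    for n in range(1, n_max + 1):
        if beta_params(alpha, n)[1] < 1:
            continue  # Beta(b, g) is not defined when g = 0; skip in this sketch
        if limit_coverage_prob(n, alpha, eps) >= tau:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical inputs: alpha = 0.1, tolerance 0.05, probability 0.9.
    print(min_calibration_size(0.1, 0.05, 0.9))
```

Because of the ceiling in $b$, the probability is not monotone in $n$, so the search checks every $n$ in order rather than bisecting. The linear scan with an exact tail sum is adequate for moderate tolerances; tight tolerances that push $n$ into the tens of thousands would call for a faster incomplete-beta routine.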

Paper Structure

This paper contains 4 sections, 3 theorems, 18 equations, and 1 table.

Key Result

Lemma 1

Under the data exchangeability assumption, the sequence of conformity scores $\{S_i\}_{i\geq 1}$ is exchangeable.

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 1
  • Theorem 2
  • Proof of Lemma 1
  • Proof of Theorem 1
  • Proof of Theorem 2