On Training-Conditional Conformal Prediction and Binomial Proportion Confidence Intervals

Rudi Coppola, Manuel Mazo

TL;DR

This work critically evaluates training-conditional conformal prediction (CP) in the context of Binomial Proportion Confidence Intervals (BPCI) for safety-critical settings. It shows that CP’s PAC-style guarantees pertain to the probability that a new sample falls inside a predicted set, not to directly estimating the Bernoulli mean $b$, and that for a Bernoulli specialization the guarantees can degenerate to trivial, uninformative predictions. Through a special-case analysis with an indicator nonconformity, the authors demonstrate that CP fails to provide nontrivial confidence intervals for $b$, unlike traditional BPCI methods such as Clopper-Pearson. They also argue that several recent safety-verification works built on CP are incorrect in their interpretation of guarantees and emphasize that CP remains valuable for set-valued uncertainty quantification rather than interval estimation of a Bernoulli parameter. The paper suggests future directions to develop CP formulations that are genuinely applicable to BPCI-type problems in statistical safety certification.

Abstract

Estimating the expectation of a Bernoulli random variable based on N independent trials is a classical problem in statistics, typically addressed using Binomial Proportion Confidence Intervals (BPCI). In the control systems community, many critical tasks, such as certifying the statistical safety of dynamical systems, can be formulated as BPCI problems. Conformal Prediction (CP), a distribution-free technique for uncertainty quantification, has gained significant attention in recent years and has been applied to various control systems problems, particularly to address uncertainties in learned dynamics or controllers. A variant known as training-conditional CP was recently employed to tackle the problem of safety certification. In this note, we highlight that the use of training-conditional CP in this context does not provide valid safety guarantees. We demonstrate why CP is unsuitable for BPCI problems and argue that traditional BPCI methods are better suited for statistical safety certification.
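To make the BPCI setting concrete, the following is a minimal sketch of the Clopper-Pearson interval, the traditional BPCI method named in the TL;DR. It is not code from the paper: the function names `binom_cdf`, `_bisect`, and `clopper_pearson` are illustrative assumptions, and the interval is obtained by bisection on the exact binomial tail probabilities instead of calling a statistics library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of a function f that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Two-sided (1 - alpha) Clopper-Pearson interval for the Bernoulli mean,
    given k successes observed in n independent trials (illustrative helper)."""
    # Lower bound: the p at which P[X >= k | p] equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P[X <= k | p] equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2
    )
    return lower, upper
```

For example, `clopper_pearson(0, 10)` returns an interval of roughly (0.0, 0.3085): after 10 trials with no observed event, the 95% upper bound on the Bernoulli mean is $1 - 0.025^{1/10}$. This is the kind of nontrivial interval on the Bernoulli mean that, according to the paper, training-conditional CP does not provide.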

Paper Structure

This paper contains 7 sections, 1 theorem, 15 equations, 2 figures.

Key Result

Theorem 1

Choose $\epsilon,E\in[0,1]$ in the $\sigma$-algebra $\mathcal{F}^N$ of the product probability space $(\mathbf{Z}^N, \mathcal{F

Note on the notation: in the original formulation of CP, $\epsilon$ has a double role. It is the significance level (appearing as the index of the INP $\Gamma^{\epsilon}$), and it describes the coverage probability as $1-\epsilon$; see shafer2008tutorial for details.

Figures (2)

  • Figure 1: On the left, a representation of the product space $\mathbf{Z}^2=\mathbf{Z}\times\mathbf{Z}$, partitioned according to the sets $Q$ and $\overline{Q}$, and a hypothetical calibration set $(z_1,z_2)$ as in Case 1. On the right, a summary of Cases 1, 2, and 3. The $x$-, $y$-, and $z$-axes show the values of $\epsilon$, the prediction (or support) of the INP, and the probability mass function, respectively. For any given $\epsilon$, the INP $\Gamma^{\epsilon}$ can be viewed as a discrete random variable with support $Q$, $\overline{Q}$ and $\mathbf{Z}$. In the figure, for $\epsilon=0.8$ and $b=0.3$, the INP predicts $Q$ with probability 0, $\mathbf{Z}$ with probability $b^2$, and $\overline{Q}$ with probability $1-b^2$.
  • Figure 2: On the left, the curves resulting from $b_{2,q} > E_q$; on the right, the curves resulting from $b_{1,q}\leq E_q$, for $q=0,\dots,98$.

Theorems & Definitions (3)

  • Theorem 1: vovk2012conditional
  • Example 1
  • Remark 1