Beyond Data Splitting: Full-Data Conformal Prediction by Differential Privacy

Young Hyun Cho; Jordan Awan

TL;DR

The paper proposes a full-data, privacy-preserving conformal prediction framework that avoids data splitting. It uses the stability induced by differential privacy to control the gap between in-sample and out-of-sample conformal scores, and pairs this with a conservative private quantile routine designed to prevent under-coverage.

Abstract

Privacy protection and uncertainty quantification are increasingly important in data-driven decision making. Conformal prediction provides finite-sample marginal coverage, but existing private approaches often rely on data splitting, reducing the effective sample size. We propose a full-data privacy-preserving conformal prediction framework that avoids splitting. Our framework leverages stability induced by differential privacy to control the gap between in-sample and out-of-sample conformal scores, and pairs this with a conservative private quantile routine designed to prevent under-coverage. We show that a generic differential privacy guarantee yields a universal coverage floor, yet cannot generally recover the nominal $1-\alpha$ level. We then provide a refined, mechanism-specific stability analysis that yields asymptotic recovery of the nominal level. Experiments demonstrate sharper prediction sets than the split-based private baseline.
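To make the idea of a conservative private quantile routine concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a generic noisy binary search with Laplace noise on counts, treats the scores as one fixed value per record (so each count has sensitivity 1), and uses hypothetical names throughout (`noisy_binary_search_quantile`, `eps_per_step`, `buffer`). The buffer biases the search upward, so the returned threshold tends to overshoot rather than undershoot the target rank.

```python
import numpy as np

def noisy_binary_search_quantile(scores, target_rank, lower, upper,
                                 eps_per_step, steps, buffer, rng):
    """Illustrative upward-biased private quantile search (hypothetical API).

    Each step releases one Laplace-noised count of sensitivity 1, so basic
    composition gives (steps * eps_per_step) pure DP for the whole search,
    under the assumption that each record contributes one fixed score.
    """
    scores = np.asarray(scores, dtype=float)
    lo, hi = float(lower), float(upper)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        count = np.sum(scores <= mid)
        noisy = count + rng.laplace(scale=1.0 / eps_per_step)
        # Lower the upper end only when the noisy count clears the target
        # by `buffer`; otherwise keep the larger (more conservative) value.
        if noisy >= target_rank + buffer:
            hi = mid
        else:
            lo = mid
    return hi

rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 1.0, size=1000)
q_private = noisy_binary_search_quantile(
    scores, target_rank=900, lower=0.0, upper=1.0,
    eps_per_step=0.1, steps=10, buffer=50, rng=rng,
)
print("private threshold:", q_private)
```

A larger `buffer` lowers the chance of under-coverage at the cost of wider prediction sets; how to calibrate it against the noise scale is exactly the kind of question the paper's analysis addresses, and this sketch does not attempt to reproduce that calibration.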

Paper Structure (50 sections, 12 theorems, 190 equations, 2 figures, 11 tables, 4 algorithms)

Introduction
Our Contributions
Related Studies
Paper Organization
Background and Motivation
Differential Privacy
Conformal Prediction
Why Full-Data Use Matters under Privacy
Proposed Framework
Overall Procedure
Conservative Differentially Private Quantile Estimation
Privacy Analysis
Coverage Analysis
A Universal Coverage Guarantee from DP and Its Limitation
Refined Coverage Guarantee with Further Assumptions
...and 35 more sections

Key Result

Proposition 1

Let $\{S_i\}_{i=1}^{k}$ be exchangeable scores, and let $\hat{q}$ be their $\lceil(1-\alpha)k\rceil$-th order statistic. Then, for any $S_i$, we have $\mathbb{P}(S_i \le \hat{q}) \ge 1 - \alpha.$
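The proposition can be checked numerically. The sketch below is an illustration under assumptions that are not part of the source: the scores are i.i.d. standard normal draws (i.i.d. implies exchangeable), and the helper name `conformal_quantile` is hypothetical. It estimates $\mathbb{P}(S_1 \le \hat{q})$ by simulation.

```python
import math
import numpy as np

def conformal_quantile(scores, alpha):
    """Return the ceil((1 - alpha) * k)-th smallest of k scores (1-indexed)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[-1]
    rank = math.ceil((1 - alpha) * k)
    # Sort along the last axis and pick the rank-th order statistic.
    return np.sort(scores, axis=-1)[..., rank - 1]

rng = np.random.default_rng(0)
alpha, k, trials = 0.1, 50, 20000

# Each row is one independent batch of k exchangeable scores.
batches = rng.normal(size=(trials, k))
q_hat = conformal_quantile(batches, alpha)

# Fraction of batches in which the first score falls at or below q_hat.
coverage = np.mean(batches[:, 0] <= q_hat)
print("empirical coverage:", coverage)
```

With continuous scores and no ties, the probability equals $\lceil(1-\alpha)k\rceil / k$, so the estimate should sit near $0.9$ for these settings, up to Monte Carlo error.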

Figures (2)

Figure 1: Conceptual illustration of the distributional shift. The top row represents the ideal "exchangeable" world where $\theta_{n+1}$ is trained on all data points including the test point. The bottom row represents the actual "non-exchangeable" world where $\theta_n$ is used; the test score $S_{n+1}^{(n)}$ (red) is an out-of-sample evaluation. DP acts as a stabiliser, bounding the distance between $\theta_{n+1}$ and $\theta_n$, thereby ensuring that the red box remains distributionally close to the blue box.
Figure S1: Trajectory stability vs. estimation error under synchronized coupling. Each panel corresponds to $\epsilon\in\{0.5,1,2\}$ (with $\delta=10^{-5}$), reporting the mean over $R=30$ runs with a shaded uncertainty band.

Theorems & Definitions (33)

Definition 1: $f$-DP (Dong et al., 2022)
Definition 2: Exchangeability
Proposition 1 (Vovk et al., 2005)
Lemma 1: One-sided conservativeness of Algorithm \ref{alg:dp_binary_search}
Lemma 2: Privacy of Buffered Binary Search
Theorem 1: Overall Privacy Guarantee
Theorem 2
Remark 1: Proof sketch and insight
Corollary 1: Black-box $f$-DP floor for DP-SCP
Example 1: Regression example
...and 23 more
