Conformal Prediction under Levy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations

Liviu Aolaritei, Zheyu Oliver Wang, Julie Zhu, Michael I. Jordan, Youssef Marzouk

TL;DR

This work addresses conformal prediction under distribution shift by modeling the shift with Lévy–Prokhorov (LP) ambiguity sets centered at the training distribution. By propagating the LP ambiguity set through the nonconformity scoring function, the authors reduce complex input–label perturbations to a one-dimensional shift in score space and derive closed-form expressions for the worst-case quantile and the worst-case coverage. They then construct distributionally robust conformal prediction sets that depend explicitly on the LP parameters, and prove finite-sample guarantees that degrade gracefully with global perturbations, while local perturbations adjust the interval width. Experiments on MNIST, ImageNet, and iWildCam show valid coverage and competitive set sizes under both synthetic and real-world shifts, and a data-driven procedure estimates the LP parameters. The approach offers a principled, assumption-light framework for robust prediction intervals: it does not rely on likelihood ratios or absolute continuity, and it accommodates broad, combined local and global distribution shifts.

Abstract

Conformal prediction provides a powerful framework for constructing prediction intervals with finite-sample guarantees, yet its robustness under distribution shifts remains a significant challenge. This paper addresses this limitation by modeling distribution shifts using Levy-Prokhorov (LP) ambiguity sets, which capture both local and global perturbations. We provide a self-contained overview of LP ambiguity sets and their connections to popular metrics such as Wasserstein and Total Variation. We show that the link between conformal prediction and LP ambiguity sets is a natural one: by propagating the LP ambiguity set through the scoring function, we reduce complex high-dimensional distribution shifts to manageable one-dimensional distribution shifts, enabling exact quantification of worst-case quantiles and coverage. Building on this analysis, we construct robust conformal prediction intervals that remain valid under distribution shifts, explicitly linking LP parameters to interval width and confidence levels. Experimental results on real-world datasets demonstrate the effectiveness of the proposed approach.
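The reduction to a one-dimensional shift in score space can be illustrated with a short sketch. It assumes, and this summary does not state it, that the worst-case quantile at level $\beta$ takes the form "empirical quantile at level $\beta + \rho$, shifted by $\varepsilon$", where $\varepsilon$ is the local and $\rho$ the global LP parameter. The function name `robust_score_threshold` and the standard split-conformal $(n+1)$ correction are illustrative choices, not necessarily the paper's exact finite-sample construction.

```python
import math
from typing import Sequence


def robust_score_threshold(
    cal_scores: Sequence[float], alpha: float, eps: float, rho: float
) -> float:
    """Hypothetical robust threshold on nonconformity scores.

    Assumption (not taken from the summary): the global parameter rho
    raises the quantile level from 1 - alpha to 1 - alpha + rho, and the
    local parameter eps shifts the resulting quantile upward by eps.
    """
    n = len(cal_scores)
    level = 1.0 - alpha + rho
    if level >= 1.0:
        # Too much mass may be moved arbitrarily far: only the trivial
        # (infinite) threshold can guarantee coverage.
        return math.inf
    # Standard split-conformal rank; the small tolerance guards against
    # floating-point round-up when (n + 1) * level is an integer.
    k = math.ceil((n + 1) * level - 1e-12)
    if k > n:
        return math.inf
    return sorted(cal_scores)[k - 1] + eps


if __name__ == "__main__":
    scores = [float(i) for i in range(1, 100)]  # toy calibration scores
    print(robust_score_threshold(scores, alpha=0.1, eps=0.0, rho=0.0))
    print(robust_score_threshold(scores, alpha=0.1, eps=0.5, rho=0.045))
```

With $\varepsilon = \rho = 0$ the sketch falls back to the usual split-conformal threshold, which matches the idea that the LP parameters only enlarge the set as the assumed shift grows.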

Paper Structure

This paper contains 16 sections, 9 theorems, 55 equations, 6 figures, 1 algorithm.

Key Result

Proposition 2.1

The LP ambiguity set can be equivalently rewritten as

Figures (6)

  • Figure 1: (Left) Worst-case quantile; (Right) Worst-case coverage.
  • Figure 2: Score distribution shift. Plots for MNIST and ImageNet under the ($p=0.05, u=1.0$) perturbation. The score distributions obtained from the unperturbed data (red) and from the perturbed data (blue) are plotted on a log scale. For ImageNet, we removed 18 negative-valued outliers ranging from -5.5 to -10 for visualization purposes.
  • Figure 3: MNIST and ImageNet. Coverage (validity) and size (efficiency). In the coverage plots, the long dashed line indicates the target $1 - \alpha$ level. Scattered points show empirical coverage and prediction set size for each calibration–test split, while short horizontal lines denote averages across $M = 30$ splits. The proposed methods are highlighted in bold/red.
  • Figure 4: iWildCam. Coverage (left) and prediction set size (right) over a range of $(\varepsilon,\rho)$ values. The white dashed line denotes the set of $(\varepsilon, \rho)$ pairs achieving exactly 90% empirical coverage. White circles correspond to points estimated by the algorithm in Appendix \ref{sec:estimation:rho:vareps}, and the best-performing pair among them (yielding the smallest prediction set) is marked by a black diamond. For comparison, the smallest prediction set along the 90% coverage frontier is shown with a black circle.
  • Figure 5: ImageNet $(\varepsilon, \rho)$ estimation. Each point in the 20-point grid corresponds to a candidate $(\varepsilon, \rho)$ pair, where $\varepsilon \in (0.5, 1.5)$ and $\rho$ is estimated using one-dimensional optimal transport between the empirical calibration and test score distributions, each constructed from 1000 samples. The color scale represents the empirical worst-case quantile associated with each pair, computed on a held-out calibration batch. The optimal $(\varepsilon, \rho)$ pair, yielding the smallest quantile, is highlighted in red, with the corresponding empirical coverage and prediction set size annotated. The true corruption parameters $(p, u)$ used to generate the test distribution are also indicated for reference.
  • ...and 1 more figure
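The Figure 5 caption describes estimating $\rho$ for each candidate $\varepsilon$ via one-dimensional optimal transport between empirical calibration and test score distributions. A minimal sketch of one way to do this follows. It assumes equal-size samples and a 0-1 transport cost $\mathbf{1}\{|x - y| > \varepsilon\}$, so that $\rho$ is the smallest fraction of scores that cannot be paired within distance $\varepsilon$. The function name `estimate_rho` and the greedy matching are illustrative and are not taken from the paper's appendix.

```python
from typing import Sequence


def estimate_rho(cal_scores: Sequence[float], test_scores: Sequence[float], eps: float) -> float:
    """Hypothetical estimate of the global parameter rho for a given eps.

    Assumes both samples have the same size n. Pairs (x, y) with
    |x - y| <= eps count as "locally explained"; rho is the fraction of
    points left unpaired under a maximum matching. On the real line, a
    greedy two-pointer sweep over the sorted samples attains that maximum.
    """
    if len(cal_scores) != len(test_scores):
        raise ValueError("this sketch assumes equal sample sizes")
    xs, ys = sorted(cal_scores), sorted(test_scores)
    n = len(xs)
    i = j = matched = 0
    while i < n and j < n:
        if abs(xs[i] - ys[j]) <= eps:
            matched += 1  # pair the two smallest remaining scores
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1  # xs[i] is too small to pair with any remaining y
        else:
            j += 1  # ys[j] is too small to pair with any remaining x
    return 1.0 - matched / n


if __name__ == "__main__":
    cal = [0.0, 1.0, 2.0, 3.0]
    test = [0.1, 1.1, 2.1, 10.0]
    print(estimate_rho(cal, test, eps=0.2))  # one of four scores is unmatched
```

Sweeping `eps` over a grid and keeping the pair whose worst-case quantile is smallest would mirror the selection rule described in the caption.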

Theorems & Definitions (24)

  • Proposition 2.1: Decomposition of the LP ambiguity set
  • Corollary 2.2: Relationship to other metrics
  • Proposition 2.3: Local and Global Perturbation
  • Remark 2.4: Absolute continuity
  • Proposition 2.5: Propagation of the LP ambiguity set
  • Remark 2.6: Lipschitzness of the score function
  • Definition 3.1: Worst-case quantile
  • Definition 3.2: Worst-case coverage
  • Remark 3.3: Case $\rho\geq 1 - \beta$
  • Proposition 3.4: Worst-case quantile in the LP ambiguity set
  • ...and 14 more