Shape-Adaptive Conditional Calibration for Conformal Prediction via Minimax Optimization

Yajie Bao, Chuchen Zhang, Zhaojun Wang, Haojie Ren, Changliang Zou

Abstract

Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterization of conditional coverage through marginal moment restrictions, we introduce Minimax Optimization Predictive Inference (MOPI), a framework that generalizes prior work by optimizing over a flexible class of set-valued mappings during the calibration phase, rather than simply calibrating a fixed sublevel set. This minimax formulation effectively circumvents the structural constraints of predefined score functions, achieving superior shape adaptivity while maintaining a principled connection to the minimization of mean squared coverage error. Theoretically, we provide non-asymptotic oracle inequalities and show that the convergence rate of the coverage error attains the optimal order under regularity conditions. MOPI also enables valid inference conditional on sensitive attributes that are available during calibration but unobserved at test time. Empirical results on complex, non-standard conditional distributions demonstrate that MOPI produces more efficient prediction sets than existing baselines.

Paper Structure

This paper contains 58 sections, 32 theorems, 232 equations, 16 figures, 4 tables.

Key Result

Proposition 2.1

If $\alpha(\cdot;C)-\alpha \in {\mathcal{F}}$ for any $C\in \mathfrak{C}$, we have $\max_{f\in {\mathcal{F}}}\Psi(C,f) = \mathsf{MSCE}(C)/4$ for any $C\in {\mathfrak{C}}$ and $\mathsf{MSCE}(C^*) = \mathsf{MSCE}(C^{\rm ora})$.
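
The identity $\max_{f\in {\mathcal{F}}}\Psi(C,f) = \mathsf{MSCE}(C)/4$ can be illustrated by completing the square. This summary does not define $\Psi$, $\alpha(\cdot;C)$, or $\mathsf{MSCE}$, so the sketch below rests on assumed forms that are merely consistent with the stated result and may differ from the paper's actual definitions: $g_C(x) = \alpha(x;C)-\alpha$ (a hypothetical shorthand), $\mathsf{MSCE}(C) = \mathbb{E}[g_C(X)^2]$, and $\Psi(C,f) = \mathbb{E}[f(X)g_C(X)] - \mathbb{E}[f(X)^2]$.

```latex
% Assumed forms (NOT taken from the source; for illustration only):
%   g_C(x)     = \alpha(x;C) - \alpha
%   MSCE(C)    = E[ g_C(X)^2 ]
%   \Psi(C,f)  = E[ f(X) g_C(X) ] - E[ f(X)^2 ]
\begin{align*}
\Psi(C,f)
  &= \mathbb{E}\big[f(X)\,g_C(X)\big] - \mathbb{E}\big[f(X)^2\big] \\
  &= \tfrac{1}{4}\,\mathbb{E}\big[g_C(X)^2\big]
     - \mathbb{E}\Big[\big(f(X) - \tfrac{1}{2}\,g_C(X)\big)^2\Big] \\
  &\le \tfrac{1}{4}\,\mathsf{MSCE}(C),
\end{align*}
% with equality if and only if f = g_C / 2 almost surely.
```

Under these assumed forms the maximizer is $f = g_C/2$, so the bound is attained whenever $g_C/2 \in {\mathcal{F}}$; this follows from the proposition's condition $\alpha(\cdot;C)-\alpha \in {\mathcal{F}}$ if ${\mathcal{F}}$ is additionally closed under scaling by $1/2$, which is an extra assumption of this sketch. If $C^*$ denotes a minimizer of $\max_{f\in{\mathcal{F}}}\Psi(C,f)$ and $C^{\rm ora}$ a minimizer of $\mathsf{MSCE}(C)$ over ${\mathfrak{C}}$ (again assumed readings of the notation), the two objectives differ only by the factor $1/4$, which would give $\mathsf{MSCE}(C^*) = \mathsf{MSCE}(C^{\rm ora})$.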

Figures (16)

  • Figure 1: Conditional coverage metrics and log of set volumes (denoted by $\log|\widehat{C}|$) versus sample sizes of calibration set under ellipsoidal sets and $d_{{\mathcal{Y}}} = 2$. The error bars represent the standard deviation of the metrics over $100$ replications.
  • Figure 2: Group-conditional coverage rates and set sizes under box sets and $d_{{\mathcal{Y}}} = 2$. The error bars represent the standard deviation of the metrics over $100$ replications.
  • Figure 3: Marginal coverage rate and conditional coverage rates on sensitive attributes.
  • Figure 4: Root of MSCE and conditional coverage rates under Setting $1^{\prime}$ as $\rho^*$ varies. The error bars represent the standard deviation of the metrics over $100$ replications.
  • Figure 5: Coverage comparison for the Households dataset. The error bars (left) show the standard deviation, and the confidence bands (right) represent approximate $95\%$ normal confidence intervals, both computed over $100$ replications.
  • ...and 11 more figures

Theorems & Definitions (63)

  • Example 2.1: Test-conditional coverage
  • Example 2.2: Group-conditional coverage
  • Example 2.3: Equalized coverage on sensitive attributes
  • Remark 2.1
  • Proposition 2.1
  • Remark 2.2
  • Remark 2.3
  • Lemma 3.1
  • Lemma 3.2
  • Theorem 4.1: Oracle inequality
  • ...and 53 more