Leverage-Weighted Conformal Prediction

Shreyas Fadnavis

TL;DR

LWCP provably preserves finite-sample marginal validity for any weight function, achieves asymptotically optimal conditional coverage at essentially no width cost when heteroscedasticity factors through leverage, and recovers the form and width of classical prediction intervals under Gaussian assumptions while retaining distribution-free guarantees.

Abstract

Split conformal prediction provides distribution-free prediction intervals with finite-sample marginal coverage, but produces constant-width intervals that overcover in low-variance regions and undercover in high-variance regions. Existing adaptive methods require training auxiliary models. We propose Leverage-Weighted Conformal Prediction (LWCP), which weights nonconformity scores by a function of the statistical leverage -- the diagonal of the hat matrix -- deriving adaptivity from the geometry of the design matrix rather than from auxiliary model fitting. We prove that LWCP preserves finite-sample marginal validity for any weight function; achieves asymptotically optimal conditional coverage at essentially no width cost when heteroscedasticity factors through leverage; and recovers the form and width of classical prediction intervals under Gaussian assumptions while retaining distribution-free guarantees. We further establish that randomized leverage approximations preserve coverage exactly with controlled width perturbation, and that vanilla CP suffers a persistent, sample-size-independent conditional coverage gap that LWCP eliminates. The method requires no hyperparameters beyond the choice of weight function and adds negligible computational overhead to vanilla CP. Experiments on synthetic and real data confirm the theoretical predictions, demonstrating substantial reductions in conditional coverage disparity across settings.
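
To make the construction concrete, here is a minimal sketch of a leverage-weighted split conformal interval. Several choices are assumptions made for illustration rather than details confirmed by the abstract: the base predictor is ordinary least squares, the score has the form $w(h(x))\,|y - \hat{f}(x)|$, leverage is computed against the training design only, and the default weight is $w(h) = (1+h)^{-1/2}$ (the choice named in the figure captions). The function names `leverage` and `lwcp_intervals` are hypothetical.

```python
import numpy as np


def leverage(X_train, X):
    """Leverage h(x) = x^T (X_train^T X_train)^+ x for each row x of X."""
    gram_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form: sum_{j,k} X[i, j] * gram_inv[j, k] * X[i, k].
    return np.einsum("ij,jk,ik->i", X, gram_inv, X)


def lwcp_intervals(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1,
                   weight=lambda h: (1.0 + h) ** -0.5):
    """Sketch of leverage-weighted split conformal intervals (assumed form)."""
    # 1. Fit the base predictor on the training split (OLS is an assumption).
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

    # 2. Weighted nonconformity scores on the calibration split.
    w_cal = weight(leverage(X_train, X_cal))
    scores = w_cal * np.abs(y_cal - X_cal @ beta)

    # 3. Conformal quantile: the ceil((1 - alpha)(n + 1))-th smallest score,
    #    or +inf when that rank exceeds the number of calibration points.
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))
    q = np.sort(scores)[k - 1] if k <= n else np.inf

    # 4. Undo the weighting at each test point to get its half-width.
    w_test = weight(leverage(X_train, X_test))
    half_width = q / w_test
    pred = X_test @ beta
    return pred - half_width, pred + half_width
```

With the default weight, the half-width at a test point is `q * sqrt(1 + h)`, so intervals widen at high-leverage points without any auxiliary model. The only extra work over vanilla split CP is one Gram-matrix pseudo-inverse and a quadratic form per point.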

Paper Structure (79 sections, 31 theorems, 35 equations, 12 figures, 24 tables, 1 algorithm)

Key Result

Theorem 3.1

Let $(X_i, Y_i)$, $i = 1, \ldots, n+1$, be exchangeable. For any predictor $\hat{f}$ trained on $\mathcal{D}_1$ and any measurable $w : [0,\infty) \to \mathbb{R}_+$, the LWCP interval eq:lwcp_interval satisfies $\mathbb{P}(Y_{n+1} \in \hat{\mathcal{C}}_n^w(X_{n+1}) \mid \mathcal{D}_1) \geq 1 - \alpha$ ...
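
The reason an arbitrary weight function cannot break marginal validity can be sketched with the standard split-conformal rank argument. This is a sketch under assumptions, not necessarily the paper's own proof. The notation is introduced here for illustration: $\mathcal{I}_2$ indexes the calibration points, $n_2 = |\mathcal{I}_2|$, $h(x)$ is the leverage of $x$ computed from $\mathcal{D}_1$, the score is assumed to have the form $S_i$ below, and $w$ is assumed strictly positive at the points involved.

```latex
% Assumed notation (not from the source): \mathcal{I}_2, n_2, S_i, \hat q.
\begin{align*}
S_i &= w\bigl(h(X_i)\bigr)\,\bigl|Y_i - \hat f(X_i)\bigr|,
      \qquad i \in \mathcal{I}_2 \cup \{n+1\}, \\
\hat q &= \text{the } \lceil (1-\alpha)(n_2+1) \rceil\text{-th smallest of }
      \{S_i\}_{i \in \mathcal{I}_2}
      \quad (\hat q = +\infty \text{ if that rank exceeds } n_2), \\
Y_{n+1} \in \hat{\mathcal{C}}_n^w(X_{n+1})
      &\iff S_{n+1} \le \hat q, \\
\mathbb{P}\bigl(S_{n+1} \le \hat q \mid \mathcal{D}_1\bigr)
      &\ge \frac{\lceil (1-\alpha)(n_2+1) \rceil}{n_2+1} \;\ge\; 1-\alpha .
\end{align*}
```

Conditional on $\mathcal{D}_1$, both $\hat{f}$ and $h$ are fixed, so each $S_i$ is the same measurable function of $(X_i, Y_i)$. The scores inherit exchangeability from the data whatever $w$ is, and the last line is the usual bound on the rank of $S_{n+1}$ among exchangeable scores.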

Figures (12)

  • Figure 1: Conditional coverage by leverage decile (200 replications). Vanilla CP (gray) exhibits monotone undercoverage at high leverage. LWCP (red) achieves approximately flat conditional coverage across all DGPs, with the largest improvement under homoscedastic errors ($p/n_1 = 0.3$) where $(1{+}h)^{-1/2}$ exactly stabilizes the prediction variance.
  • Figure 2: Conditional coverage across methods. LWCP (red) achieves the flattest coverage profile at the lowest computational cost. CQR (orange) attains comparable flatness but with substantially wider intervals. Studentized CP (green) achieves moderate improvement at significantly higher runtime.
  • Figure 3: Recovery of classical Gaussian prediction intervals (\ref{thm:gaussian_recovery}). (a) The LWCP/classical width ratio converges to 1.0 at rate $O(1/\sqrt{n})$. (b) At $n = 2{,}000$, LWCP and classical widths are visually indistinguishable.
  • Figure 4: Heteroscedasticity sensitivity. LWCP improves when $g(h)$ is leverage-dependent, is harmless when $g=1$, and provides minimal benefit under adversarial $\operatorname{Var} \propto \|X\|^2$.
  • Figure 5: Interval width vs. leverage. Vanilla CP (gray) produces constant-width intervals. LWCP (red) adapts following the $\sqrt{1+h}$ scaling.
  • ...and 7 more figures
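
The $\sqrt{1+h}$ scaling mentioned in the captions for Figures 1 and 5 matches a standard least-squares identity, worked out below as a sketch. The assumptions are a homoscedastic linear model $Y = x^\top \beta + \varepsilon$ with noise variance $\sigma^2$, an OLS fit $\hat\beta$ on a training design $\mathbf{X}_1$ of full column rank, and a test point independent of the training data. The symbols $\mathbf{X}_1$, $\hat\beta$, and $\sigma^2$ are notation introduced here, not taken from the paper.

```latex
\begin{align*}
h(x) &= x^\top (\mathbf{X}_1^\top \mathbf{X}_1)^{-1} x, \\
\operatorname{Var}\bigl(Y - x^\top \hat\beta \mid x, \mathbf{X}_1\bigr)
  &= \underbrace{\sigma^2}_{\text{noise}}
   + \underbrace{\sigma^2\, x^\top (\mathbf{X}_1^\top \mathbf{X}_1)^{-1} x}_{\text{estimation error}}
   = \sigma^2 \bigl(1 + h(x)\bigr), \\
\operatorname{Var}\Bigl( \bigl(1 + h(x)\bigr)^{-1/2} \bigl(Y - x^\top \hat\beta\bigr) \mid x, \mathbf{X}_1 \Bigr)
  &= \sigma^2 .
\end{align*}
```

Under these assumptions the weighted residual has the same variance at every leverage level, so a single quantile of weighted scores serves all test points. Dividing by the weight then gives a half-width proportional to $\sqrt{1 + h(x)}$, the same factor that appears in the classical Gaussian prediction interval.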

Theorems & Definitions (91)

  • Definition 2.1: Leverage-weighted nonconformity score
  • Definition 2.2: LWCP prediction interval
  • Theorem 3.1: Marginal coverage
  • Proof
  • Theorem 3.4: Efficiency of variance-stabilized LWCP
  • Proof sketch
  • Theorem 3.5: Non-asymptotic width parity
  • Proof sketch
  • Proposition 3.6: Conditional coverage gap
  • Proof sketch
  • ...and 81 more