Leverage-Weighted Conformal Prediction

Shreyas Fadnavis

TL;DR

We prove that LWCP preserves finite-sample marginal validity for any weight function, achieves asymptotically optimal conditional coverage at essentially no width cost when heteroscedasticity factors through leverage, and recovers the form and width of classical prediction intervals under Gaussian assumptions while retaining distribution-free guarantees.

Abstract

Split conformal prediction provides distribution-free prediction intervals with finite-sample marginal coverage, but produces constant-width intervals that overcover in low-variance regions and undercover in high-variance regions. Existing adaptive methods require training auxiliary models. We propose Leverage-Weighted Conformal Prediction (LWCP), which weights nonconformity scores by a function of the statistical leverage -- the diagonal of the hat matrix -- deriving adaptivity from the geometry of the design matrix rather than from auxiliary model fitting. We prove that LWCP preserves finite-sample marginal validity for any weight function; achieves asymptotically optimal conditional coverage at essentially no width cost when heteroscedasticity factors through leverage; and recovers the form and width of classical prediction intervals under Gaussian assumptions while retaining distribution-free guarantees. We further establish that randomized leverage approximations preserve coverage exactly with controlled width perturbation, and that vanilla CP suffers a persistent, sample-size-independent conditional coverage gap that LWCP eliminates. The method requires no hyperparameters beyond the choice of weight function and adds negligible computational overhead to vanilla CP. Experiments on synthetic and real data confirm the theoretical predictions, demonstrating substantial reductions in conditional coverage disparity across settings.
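
The procedure described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the base predictor is taken to be ordinary least squares, the leverage of a point $x$ is taken to be $h(x) = x^\top (X_1^\top X_1)^{-1} x$ with $X_1$ the training design matrix, and the weight is $w(h) = (1+h)^{-1/2}$ (the form mentioned in the Figure 1 caption). The function names `leverage`, `inv_sqrt_weight`, `lwcp_fit`, and `lwcp_interval` are hypothetical.

```python
import numpy as np

def leverage(X_train, X):
    # Assumed definition: h(x) = x^T (X_train^T X_train)^{-1} x for each row x of X.
    gram_inv = np.linalg.pinv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X, gram_inv, X)

def inv_sqrt_weight(h):
    # Assumed weight function w(h) = (1 + h)^{-1/2}.
    return (1.0 + h) ** -0.5

def lwcp_fit(X_train, y_train, X_cal, y_cal, alpha=0.1, w=inv_sqrt_weight):
    # Fit OLS on the training split only (assumed base predictor).
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    # Leverage-weighted nonconformity scores on the calibration split.
    scores = np.abs(y_cal - X_cal @ beta) * w(leverage(X_train, X_cal))
    # Standard split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score.
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.inf if k > n else np.sort(scores)[k - 1]
    return beta, q

def lwcp_interval(X_train, beta, q, X_new, w=inv_sqrt_weight):
    # Undo the weighting at the test point: half-width = q / w(h(x)).
    pred = X_new @ beta
    half = q / w(leverage(X_train, X_new))
    return pred - half, pred + half
```

With this choice of weight, the half-width at a test point is `q * sqrt(1 + h)`, so intervals widen at high-leverage points without any auxiliary model being fitted.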

Paper Structure (79 sections, 31 theorems, 35 equations, 12 figures, 24 tables, 1 algorithm)

Introduction
Contributions.
Related Work
Conformal prediction.
Adaptive and localized methods.
Leverage scores.
Leverage-Weighted Conformal Prediction
Setup.
Canonical weighting functions.
Computational cost.
LWCP+: Combining leverage with residual scale estimation.
Theoretical Results
Finite-Sample Marginal Coverage
Efficiency Under Heteroscedasticity
Classical Recovery
...and 64 more sections

Key Result

Theorem 3.1

Let $(X_i, Y_i)$, $i = 1, \ldots, n+1$, be exchangeable. For any predictor $\hat{f}$ trained on $\mathcal{D}_1$ and any measurable $w : [0,\infty) \to \mathbb{R}_+$, the LWCP interval (Definition 2.2) satisfies $\mathbb{P}(Y_{n+1} \in \hat{\mathcal{C}}_n^w(X_{n+1}) \mid \mathcal{D}_1) \geq 1 - \alpha$ ...
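
The claim that coverage holds for any weight function can be checked empirically. The simulation below is a hedged sketch, not an experiment from the paper: it fixes one training set, then repeatedly draws fresh calibration and test points and records how often the test point is covered. The data-generating process (Gaussian features, heavy-tailed $t_3$ noise), the OLS predictor, the leverage definition $h(x) = x^\top (X_1^\top X_1)^{-1} x$, and the function name `coverage_given_training` are all assumptions made for illustration.

```python
import numpy as np

def coverage_given_training(w, alpha=0.1, n_train=200, n_cal=99, p=10,
                            reps=2000, seed=0):
    rng = np.random.default_rng(seed)
    # One fixed training split D_1; the predictor is OLS (an assumption).
    X1 = rng.standard_normal((n_train, p))
    beta_true = rng.standard_normal(p)
    y1 = X1 @ beta_true + rng.standard_t(df=3, size=n_train)
    beta_hat, *_ = np.linalg.lstsq(X1, y1, rcond=None)
    gram_inv = np.linalg.inv(X1.T @ X1)
    # Rank of the conformal quantile among the calibration scores.
    k = int(np.ceil((n_cal + 1) * (1 - alpha)))
    hits = 0
    for _ in range(reps):
        # Fresh calibration points plus one test point (the last row).
        X = rng.standard_normal((n_cal + 1, p))
        y = X @ beta_true + rng.standard_t(df=3, size=n_cal + 1)
        h = np.einsum("ij,jk,ik->i", X, gram_inv, X)
        s = np.abs(y - X @ beta_hat) * w(h)
        q = np.sort(s[:n_cal])[k - 1]
        # For w > 0, the test point is covered iff its weighted score is <= q.
        hits += int(s[n_cal] <= q)
    return hits / reps

weights = {
    "vanilla (w = 1)": lambda h: np.ones_like(h),
    "(1 + h)^(-1/2)": lambda h: (1.0 + h) ** -0.5,
    "arbitrary": lambda h: 1.0 + np.sin(50.0 * h) ** 2,
}
for name, w in weights.items():
    print(name, coverage_given_training(w))
```

Each estimate should land near the nominal level $1 - \alpha$ up to Monte Carlo error, including for the deliberately odd oscillating weight. This reflects that the weight changes which points are covered, not how many.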

Figures (12)

Figure 1: Conditional coverage by leverage decile (200 replications). Vanilla CP (gray) exhibits monotone undercoverage at high leverage. LWCP (red) achieves approximately flat conditional coverage across all DGPs, with the largest improvement under homoscedastic errors ($p/n_1 = 0.3$) where $(1{+}h)^{-1/2}$ exactly stabilizes the prediction variance.
Figure 2: Conditional coverage across methods. LWCP (red) achieves the flattest coverage profile at the lowest computational cost. CQR (orange) attains comparable flatness but with substantially wider intervals. Studentized CP (green) achieves moderate improvement at significantly higher runtime.
Figure 3: Recovery of classical Gaussian prediction intervals (the Gaussian recovery theorem, thm:gaussian_recovery). (a) The LWCP/classical width ratio converges to 1.0 at rate $O(1/\sqrt{n})$. (b) At $n = 2{,}000$, LWCP and classical widths are visually indistinguishable.
Figure 4: Heteroscedasticity sensitivity. LWCP improves when $g(h)$ is leverage-dependent, is harmless when $g=1$, and provides minimal benefit under adversarial $\operatorname{Var} \propto \|X\|^2$.
Figure 5: Interval width vs. leverage. Vanilla CP (gray) produces constant-width intervals. LWCP (red) adapts following the $\sqrt{1+h}$ scaling.
...and 7 more figures
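
The $\sqrt{1+h}$ scaling in the Figure 1 and Figure 5 captions follows from a standard least-squares calculation. The derivation below is a sketch under assumptions not spelled out in this summary: a linear model $Y = x^\top\beta + \varepsilon$ with uncorrelated homoscedastic errors of variance $\sigma^2$, an OLS estimate $\hat\beta$ fitted on a full-rank training design $X_1$, a new error $\varepsilon_{\text{new}}$ independent of the training data, and leverage $h(x) = x^\top (X_1^\top X_1)^{-1} x$.

```latex
\begin{aligned}
Y_{\text{new}} - x^\top\hat\beta
  &= \varepsilon_{\text{new}} - x^\top(\hat\beta - \beta), \\
\operatorname{Var}\!\left(Y_{\text{new}} - x^\top\hat\beta \,\middle|\, X_1, x\right)
  &= \sigma^2 + \sigma^2\, x^\top (X_1^\top X_1)^{-1} x
   = \sigma^2\,\bigl(1 + h(x)\bigr), \\
\operatorname{Var}\!\left((1 + h(x))^{-1/2}\,\bigl(Y_{\text{new}} - x^\top\hat\beta\bigr) \,\middle|\, X_1, x\right)
  &= \sigma^2 .
\end{aligned}
```

Under these assumptions the weighted residual has the same variance at every $x$, which is why dividing the conformal quantile by $w(h) = (1+h)^{-1/2}$ yields a half-width proportional to $\sqrt{1+h}$.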

Theorems & Definitions (91)

Definition 2.1: Leverage-weighted nonconformity score
Definition 2.2: LWCP prediction interval
Theorem 3.1: Marginal coverage
proof
Theorem 3.4: Efficiency of variance-stabilized LWCP
proof : Proof sketch
Theorem 3.5: Non-asymptotic width parity
proof : Proof sketch
Proposition 3.6: Conditional coverage gap
proof : Proof sketch
...and 81 more
