KMM-CP: Practical Conformal Prediction under Covariate Shift via Selective Kernel Mean Matching

Siddhartha Laghuvarapu, Rohan Deb, Jimeng Sun

Abstract

Uncertainty quantification is essential for deploying machine learning models in high-stakes domains such as scientific discovery and healthcare. Conformal Prediction (CP) provides finite-sample coverage guarantees under exchangeability, an assumption often violated in practice due to distribution shift. Under covariate shift, restoring validity requires importance weighting, yet accurate density-ratio estimation becomes unstable when training and test distributions exhibit limited support overlap. We propose KMM-CP, a conformal prediction framework based on Kernel Mean Matching (KMM) for covariate-shift correction. We show that KMM directly controls the bias-variance components governing conformal coverage error by minimizing RKHS moment discrepancy under explicit weight constraints, and establish asymptotic coverage guarantees under mild conditions. We then introduce a selective extension that identifies regions of reliable support overlap and restricts conformal correction to this subset, further improving stability in low-overlap regimes. Experiments on molecular property prediction benchmarks with realistic distribution shifts show that KMM-CP reduces coverage gap by over 50% compared to existing approaches. The code is available at https://github.com/siddharthal/KMM-CP.
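The Kernel Mean Matching step described in the abstract can be sketched as a small quadratic program: minimize the RKHS moment discrepancy between the reweighted calibration sample and the test sample, subject to explicit weight bounds. The sketch below is illustrative only; the kernel choice, bandwidth `gamma`, weight bound `B`, and mean-constraint slack `eps` are assumed defaults, not the paper's settings.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kmm_weights(X_src, X_tgt, B=10.0, eps=0.1, gamma=0.5):
    """Bounded importance weights by matching RKHS means (KMM).

    Solves  min_w  0.5 * w^T K w - kappa^T w
            s.t.   0 <= w_i <= B,  |mean(w) - 1| <= eps,
    where K = k(X_src, X_src) and kappa_i = (n/m) * sum_j k(x_i, z_j).
    """
    n, m = len(X_src), len(X_tgt)
    K = rbf_kernel(X_src, X_src, gamma)
    kappa = (n / m) * rbf_kernel(X_src, X_tgt, gamma).sum(axis=1)

    obj = lambda w: 0.5 * w @ K @ w - kappa @ w
    grad = lambda w: K @ w - kappa
    cons = [  # keep the average weight within eps of 1
        {"type": "ineq", "fun": lambda w: eps - (w.mean() - 1.0)},
        {"type": "ineq", "fun": lambda w: eps + (w.mean() - 1.0)},
    ]
    res = minimize(obj, np.ones(n), jac=grad, bounds=[(0.0, B)] * n,
                   constraints=cons, method="SLSQP")
    return res.x

# Toy covariate shift: the target distribution is shifted to the right.
rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(80, 1))   # calibration covariates
X_tgt = rng.normal(1.0, 1.0, size=(80, 1))   # shifted test covariates
w = kmm_weights(X_src, X_tgt)
# Source points near the shifted target mass receive larger weights.
print(w.min() >= 0.0, w.max() <= 10.0 + 1e-6)
```

Unlike classifier-based density-ratio estimation, the box constraint `0 <= w_i <= B` caps how heavy the weight tail can become, which is the stability property the abstract appeals to.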

Paper Structure

This paper contains 67 sections, 7 theorems, 61 equations, 6 figures, 14 tables, 2 algorithms.

Key Result

Theorem 1

Assume the covariate-shift model (calibration covariates drawn from $P_X^{\text{cal}}$, test covariates from $P_X^{\text{test}}$, with a shared conditional $P_{Y\mid X}$) and $P_X^{\text{test}} \ll P_X^{\text{cal}}$. Let $w(x)=\frac{dP_X^{\text{test}}}{dP_X^{\text{cal}}}(x)$, and construct the weighted conformal prediction set $\hat{C}$ from the calibration nonconformity scores via the weighted-quantile rule. Then $\mathbb{P}\big(Y_{N+1}\in\hat{C}(X_{N+1})\big)\ge 1-\alpha$, where the probability is over $(X_{N+1},Y_{N+1})\sim P^{\text{test}}$ and the randomness in the calibration set.
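The weighted-quantile rule behind Theorem 1 (following Tibshirani et al., 2019) can be sketched in a few lines: the calibration scores get probability mass proportional to their weights, a point mass at $+\infty$ accounts for the test point, and the cutoff is the $(1-\alpha)$-quantile of that weighted distribution. Function names here are illustrative, not the paper's API.

```python
import numpy as np

def weighted_quantile_cutoff(scores, w_cal, w_test, alpha=0.1):
    """Weighted (1 - alpha)-quantile of calibration scores, with a
    point mass at +inf for the test point (Tibshirani et al., 2019)."""
    order = np.argsort(scores)
    s, w = scores[order], w_cal[order]
    p = np.append(w, w_test) / (w.sum() + w_test)  # last entry: mass at +inf
    cum = np.cumsum(p)
    idx = np.searchsorted(cum, 1.0 - alpha)
    return np.inf if idx >= len(s) else s[idx]

# Sanity check: with uniform weights this reduces to the standard
# split-conformal quantile, the ceil((n+1)(1-alpha))-th smallest score.
scores = np.arange(1.0, 11.0)  # ten calibration scores: 1, 2, ..., 10
q = weighted_quantile_cutoff(scores, np.ones(10), 1.0, alpha=0.1)
print(q)  # -> 10.0, since ceil(11 * 0.9) = 10
```

The prediction set at a test point $x$ is then $\{y : s(x, y) \le q\}$ for the chosen nonconformity score $s$.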

Figures (6)

  • Figure 1: Synthetic experiment: (a) Covariate shift with partial support overlap, the target includes regions poorly supported by the source. (b) Classifier-based density-ratio assigns heavy-tailed weights to low-support regions, reducing effective sample size (ESS). (c) Standard KMM enforces bounded weights and matches moments over the full domain, improving stability but still correcting unsupported regions. (d) Selective KMM jointly optimizes source weights and target selection variables, filtering unsupported regions and stabilizing weights. (e) Classifier weighting reduces ESS, KMM lowers moment discrepancy (MMD), and selective KMM achieves both stable ESS and improved moment matching.
  • Figure 2: Calibration curves under covariate shift (global calibration). The dashed line denotes nominal coverage. Uniform CP undercovers systematically, density-ratio baselines partially correct this, KMM improves alignment, and SKMM most closely tracks the target across datasets.
  • Figure 3: Bias--variance proxy ($\mathrm{MMD} + \sqrt{1/\mathrm{ESS}}$) under covariate shift. MAD in brackets.
  • Figure 4: Selection weights under SKMM.
  • Figure 5: Calibration coverage under global conformal prediction across datasets.
  • ...and 1 more figure
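The bias-variance proxy in Figure 3 combines two quantities that are cheap to compute from the weights: the (weighted) MMD between source and target, and the effective sample size $\mathrm{ESS} = (\sum_i w_i)^2 / \sum_i w_i^2$. A minimal sketch, with an assumed RBF kernel and bandwidth and a crude exponential tilt standing in for a density-ratio estimate:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def ess(w):
    """Effective sample size of importance weights: (sum w)^2 / sum w^2."""
    return w.sum() ** 2 / (w ** 2).sum()

def weighted_mmd(X_src, X_tgt, w, gamma=0.5):
    """Biased plug-in MMD estimate between the w-weighted source
    empirical distribution and the target empirical distribution."""
    a = w / w.sum()
    n_t = len(X_tgt)
    term_ss = a @ rbf(X_src, X_src, gamma) @ a
    term_st = a @ rbf(X_src, X_tgt, gamma).sum(axis=1) / n_t
    term_tt = rbf(X_tgt, X_tgt, gamma).sum() / n_t ** 2
    return np.sqrt(max(term_ss - 2 * term_st + term_tt, 0.0))

rng = np.random.default_rng(1)
Xs = rng.normal(0.0, 1.0, size=(100, 1))     # source: N(0, 1)
Xt = rng.normal(1.0, 1.0, size=(100, 1))     # target: N(1, 1)
uniform = np.ones(100)
tilted = np.exp(Xs.ravel())                  # crude surrogate for N(1,1)/N(0,1)
proxy = lambda w: weighted_mmd(Xs, Xt, w) + np.sqrt(1.0 / ess(w))
print(round(proxy(uniform), 3), round(proxy(tilted), 3))
```

Reweighting lowers the MMD term at the cost of a smaller ESS; the proxy makes that trade-off explicit, which is what Figure 3 reports across weighting schemes.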

Theorems & Definitions (13)

  • Theorem 1: Coverage under covariate shift tibshirani2019
  • Remark 1
  • Theorem 2: CDF stability under MMD control
  • Theorem 3: Coverage error for KMM-weighted split conformal
  • Corollary 1: Asymptotic validity of KMM-weighted split conformal
  • Proof
  • Proof
  • Theorem 4: Conditional boundary rule under a yield constraint
  • Proof
  • Corollary 2: Thresholding structure for SKMM target selection
  • ...and 3 more