
Conformal Counterfactual Inference under Hidden Confounding

Zonghao Chen, Ruocheng Guo, Jean-François Ton, Yang Liu

TL;DR

To address counterfactual inference under hidden confounding, the paper introduces wTCP-DR and its cheaper variant wSCP-DR, which use density-ratio estimation to reweight conformal scores and deliver confidence intervals with valid marginal coverage without requiring strong ignorability. The methods combine observational data with a fraction of interventional data, learning $r(x,y)=p^I(x,y)/p^O(x,y)$ to correct the distribution shift and provide finite-sample guarantees. Theoretical results quantify when these methods outperform the naive approach that uses interventional data only, and case studies with additive Gaussian noise validate the gains. Empirical evaluations on synthetic and real-world recommender data show improved coverage and tighter intervals than baselines, including cases where interventional labels are scarce or unavailable, highlighting the practical value for safe decision-making under confounding.
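
The reweighting idea can be illustrated with a small sketch. This is not the authors' implementation: it assumes a scalar covariate, an absolute-residual score, and an already-fitted density-ratio estimate passed in as a callable. The names `weighted_conformal_keep`, `conformal_interval`, `predict`, and `ratio` are hypothetical. Because the weight $\hat{r}(x,y)$ depends on the outcome, the sketch scans a grid of candidate outcomes and keeps those whose score falls below the weighted quantile of the observational calibration scores.

```python
import numpy as np


def weighted_conformal_keep(scores_cal, w_cal, scores_grid, w_grid, alpha=0.1):
    """Mark which candidate outcomes stay in the weighted conformal set."""
    scores_cal = np.asarray(scores_cal, dtype=float)
    w_cal = np.asarray(w_cal, dtype=float)
    order = np.argsort(scores_cal)
    s_sorted = scores_cal[order]
    cum_w = np.cumsum(w_cal[order])  # running weight mass of the sorted scores

    keep = np.zeros(len(scores_grid), dtype=bool)
    for k, (s_y, w_y) in enumerate(zip(scores_grid, w_grid)):
        # Normalise by the calibration mass plus the candidate's own weight,
        # which is treated as a point mass at +infinity.
        total = cum_w[-1] + w_y
        idx = np.searchsorted(cum_w / total, 1.0 - alpha, side="left")
        q = s_sorted[idx] if idx < len(s_sorted) else np.inf
        keep[k] = s_y <= q
    return keep


def conformal_interval(x_test, y_grid, x_cal, y_cal, predict, ratio, alpha=0.1):
    """Hull of the weighted conformal set for a scalar covariate x_test."""
    y_grid = np.asarray(y_grid, dtype=float)
    scores_cal = np.abs(y_cal - predict(x_cal))   # absolute-residual scores
    w_cal = ratio(x_cal, y_cal)                   # r_hat on calibration pairs
    x_rep = np.full_like(y_grid, x_test)
    scores_grid = np.abs(y_grid - predict(x_rep))
    w_grid = ratio(x_rep, y_grid)                 # r_hat(x_test, y) per candidate
    keep = weighted_conformal_keep(scores_cal, w_cal, scores_grid, w_grid, alpha)
    if not keep.any():
        return np.nan, np.nan
    return y_grid[keep].min(), y_grid[keep].max()


# Toy check: with r_hat = 1 everywhere this reduces to ordinary split conformal.
x_cal = np.zeros(10)
y_cal = np.arange(1.0, 11.0)  # scores 1..10 around a zero predictor
lo, hi = conformal_interval(
    x_test=0.0,
    y_grid=np.linspace(-12.0, 12.0, 49),
    x_cal=x_cal,
    y_cal=y_cal,
    predict=lambda x: np.zeros_like(x),
    ratio=lambda x, y: np.ones_like(y),
    alpha=0.1,
)
print(lo, hi)
```

The function returns the smallest and largest retained grid points, so it reports the hull of the prediction set; with a non-constant ratio the set itself need not be a single interval. In practice `ratio` would come from a density-ratio estimator fitted on observational and interventional samples, which this sketch leaves out.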

Abstract

Personalized decision making requires knowledge of potential outcomes under different treatments, and confidence intervals for the potential outcomes further enrich this decision-making process and improve its reliability in high-stakes scenarios. Predicting potential outcomes along with their uncertainty in a counterfactual world poses the fundamental challenge in causal inference. Existing methods that construct confidence intervals for counterfactuals either rely on the assumption of strong ignorability, or need access to unidentifiable lower and upper bounds that characterize the difference between observational and interventional distributions. To overcome these limitations, we first propose a novel approach, wTCP-DR, based on transductive weighted conformal prediction, which provides confidence intervals for counterfactual outcomes with marginal coverage guarantees, even under hidden confounding. With less restrictive assumptions, our approach requires access to a fraction of interventional data (from randomized controlled trials) to account for the covariate shift from the observational distribution to the interventional distribution. Theoretical results explicitly demonstrate the conditions under which our algorithm is strictly advantageous over the naive method that only uses interventional data. After ensuring valid intervals on counterfactuals, it is straightforward to construct intervals for individual treatment effects (ITEs). We demonstrate our method across synthetic and real-world data, including recommendation systems, to verify the superiority of our methods compared against state-of-the-art baselines in terms of both coverage and efficiency.
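
The step from counterfactual intervals to ITE intervals can be sketched as follows. The abstract does not spell out the construction, so this sketch assumes the standard one for a unit whose factual outcome is observed: the ITE is $Y(1) - Y(0)$, and the counterfactual interval is shifted by the observed outcome. The function name `ite_interval` and its arguments are hypothetical.

```python
def ite_interval(t, y_obs, cf_lo, cf_hi):
    """Interval for Y(1) - Y(0) given the factual outcome and a counterfactual interval.

    t      : received treatment (0 or 1)
    y_obs  : observed outcome under the received treatment
    cf_lo, cf_hi : interval bounds for the unobserved potential outcome
    """
    if t == 1:
        # Y(1) is observed, Y(0) lies in [cf_lo, cf_hi].
        return y_obs - cf_hi, y_obs - cf_lo
    if t == 0:
        # Y(0) is observed, Y(1) lies in [cf_lo, cf_hi].
        return cf_lo - y_obs, cf_hi - y_obs
    raise ValueError("t must be 0 or 1")


print(ite_interval(1, 5.0, 1.0, 3.0))
print(ite_interval(0, 5.0, 1.0, 3.0))
```

Under this construction the ITE interval has the same width as the counterfactual interval and covers the ITE whenever the counterfactual interval covers the unobserved outcome.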

Paper Structure

This paper contains 19 sections, 11 theorems, 28 equations, 8 figures, 5 tables, 4 algorithms.

Key Result

Proposition 1

Under the assumptions that $p^O(x,y)$ and $p^I(x,y)$ are absolutely continuous with respect to each other and that $\left(\mathbb{E}_{p^O(x,y)}\left[\hat{r}(x,y)^2\right]\right)^{1/2} < M$, the confidence interval $C_{\text{wTCP-DR}}$ constructed by the wTCP-DR algorithm satisfies [...] where $c$ is a constant and $\Delta_r = \mathbb{E}_{p^O(x,y)} |r(x,y) - \hat{r}(x, y)|$ is the app...

Figures (8)

  • Figure 1: Under hidden confounding, our proposed methods wTCP-DR and wSCP-DR incorporate a small set of interventional data for density-ratio-based weighted conformal prediction, which provides a marginal coverage guarantee along with high efficiency (small confidence intervals). In contrast, WCP (Lei and Candès, 2021) cannot guarantee coverage, as hidden confounding leads to biased estimates of propensity scores. The Naive method suffers from low efficiency as it only uses the small set of interventional data.
  • Figure 2: Example causal graph with hidden confounding. $X$: observed covariates, $U$: hidden confounders, $T$: treatment, $Y$: outcome. Directed edges denote causal relations and the bidirectional edge signifies possible correlation.
  • Figure 3: Coverage results for counterfactual outcomes and ITEs with varying hidden confounding strength. A higher-dimensional $X$ carries more information about the hidden confounders, leading to weaker hidden confounding. The corresponding interval-width results are reported in the appendix (synthetic experiments).
  • Figure 4: Impact of the interventional data size $m$ on the efficiency of conformal inference methods. See the appendix (synthetic experiments) for coverage results.
  • Figure 5: Coverage and interval-width results for counterfactual outcomes and ITEs with varying hidden confounding strength. A higher-dimensional $X$ carries more information about the hidden confounders, leading to weaker hidden confounding.
  • ...and 3 more figures

Theorems & Definitions (11)

  • Proposition 1: Proposition 4.2 from Taufiq et al. (2022)
  • Theorem 1
  • Proposition 2
  • Proposition 3
  • Proposition 4: Perturb-one stability for OLS
  • Proposition 5: Central limit theorem for weighted quantiles
  • Lemma 1
  • Lemma 2: CDF for ordering statistic
  • Lemma 3: Properties of half-normal distribution
  • Lemma 4: Central Limit Theorem for Quantile
  • ...and 1 more