Cross-World Assumption and Refining Prediction Intervals for Individual Treatment Effects

Juraj Bodik, Yaxuan Huang, Bin Yu

Abstract

While average treatment effects (ATE) and conditional average treatment effects (CATE) provide valuable population- and subgroup-level summaries, they fail to capture uncertainty at the individual level. For high-stakes decision-making, individual treatment effect (ITE) estimates must be accompanied by valid prediction intervals that reflect heterogeneity and unit-specific uncertainty. However, the fundamental unidentifiability of ITEs limits the ability to derive precise and reliable individual-level uncertainty estimates. To address this challenge, we investigate the role of a cross-world correlation parameter, $\rho(x) = \mathrm{cor}(Y(1), Y(0) \mid X = x)$, which describes the dependence between potential outcomes, given covariates, in the Neyman-Rubin super-population model with i.i.d. units. Although $\rho$ is fundamentally unidentifiable, we argue that in most real-world applications it is possible to impose reasonable and interpretable bounds informed by domain-expert knowledge. Given $\rho$, we design prediction intervals for the ITE that achieve more stable and accurate coverage with substantially shorter widths, often less than 1/3 of those from competing methods. The resulting intervals satisfy the coverage guarantee $P\big(Y(1) - Y(0) \in C_{ITE}(X)\big) \geq 1 - \alpha$ and are asymptotically optimal under Gaussian assumptions. We provide strong theoretical and empirical arguments that cross-world assumptions can make individual uncertainty quantification both practically informative and statistically valid.
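
To illustrate why $\rho$ matters for interval width, the following minimal sketch computes an oracle interval for $Y(1) - Y(0)$ in an idealized setting. It assumes that $(Y(1), Y(0))$ given $X = x$ is jointly Gaussian with known conditional means and standard deviations, so that $\mathrm{Var}\{Y(1) - Y(0) \mid X = x\} = s_1^2 + s_0^2 - 2\rho s_1 s_0$. The function name and its parameters are hypothetical, and this is not the paper's $CW(\rho)$ construction, which is given in Definition 2 and not reproduced here.

```python
import math
from statistics import NormalDist


def gaussian_ite_interval(mu1, mu0, sd1, sd0, rho, alpha=0.1):
    """Oracle (1 - alpha) interval for Y(1) - Y(0) at a fixed x.

    Assumption (not from the source): the potential outcomes are jointly
    Gaussian given X = x, with known means mu1, mu0, known standard
    deviations sd1, sd0, and cross-world correlation rho.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Variance of a difference of two correlated random variables.
    var_ite = sd1 ** 2 + sd0 ** 2 - 2.0 * rho * sd1 * sd0
    sd_ite = math.sqrt(max(var_ite, 0.0))  # guard against rounding below 0
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    center = mu1 - mu0
    return center - z * sd_ite, center + z * sd_ite


if __name__ == "__main__":
    # Same marginals in every row; only the assumed rho changes.
    for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
        lo, hi = gaussian_ite_interval(2.0, 1.0, 1.0, 1.0, rho)
        print(f"rho={rho:+.1f}  interval=({lo:.3f}, {hi:.3f})  width={hi - lo:.3f}")
```

In this toy setting the width shrinks monotonically as $\rho$ increases, and it collapses to zero when $\rho = 1$ and the two standard deviations coincide. That is the qualitative behavior the abstract attributes to bounding $\rho$.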

Paper Structure

This paper contains 36 sections, 17 theorems, 148 equations, 9 figures, 2 tables.

Key Result

Lemma 2.1

Assume that the relevant conditional second moments exist. Denote $\sigma_t(x):=\mathrm{Var}\{Y(t)\mid X=x\}$, and $\sigma_t(x,z):=\mathrm{Var}\{Y(t)\mid X=x,Z=z\}$, $t=0,1$. Then and consequently Additionally, if we assume $\mathrm{Cov}\{Y(0),Y(1)\mid X,Z\}\ge 0$, then the bound tightens to
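
One standard route to bounds of this type, offered as a sketch rather than as the lemma's exact statement, combines the law of total covariance with the Cauchy-Schwarz inequality. The notation $m_t(x,z) := \mathbb{E}\{Y(t) \mid X=x, Z=z\}$ is introduced here for illustration only; $\sigma_t$ denotes a conditional variance, as defined above, so square roots appear wherever a standard deviation is needed.

$$
\mathrm{Cov}\{Y(0),Y(1)\mid X=x\}
= \mathbb{E}\big[\mathrm{Cov}\{Y(0),Y(1)\mid X,Z\}\,\big|\,X=x\big]
+ \mathrm{Cov}\{m_0(X,Z),\,m_1(X,Z)\mid X=x\}.
$$

By Cauchy-Schwarz, $\big|\mathrm{Cov}\{Y(0),Y(1)\mid X=x,Z=z\}\big| \le \sqrt{\sigma_0(x,z)\,\sigma_1(x,z)}$, so dividing by $\sqrt{\sigma_0(x)\,\sigma_1(x)}$ gives

$$
\rho(x) \;\ge\;
\frac{\mathrm{Cov}\{m_0(X,Z),\,m_1(X,Z)\mid X=x\}
- \mathbb{E}\big[\sqrt{\sigma_0(X,Z)\,\sigma_1(X,Z)}\,\big|\,X=x\big]}
{\sqrt{\sigma_0(x)\,\sigma_1(x)}}.
$$

If in addition $\mathrm{Cov}\{Y(0),Y(1)\mid X,Z\}\ge 0$, the first term of the decomposition is nonnegative and the subtracted expectation can be dropped:

$$
\rho(x) \;\ge\;
\frac{\mathrm{Cov}\{m_0(X,Z),\,m_1(X,Z)\mid X=x\}}{\sqrt{\sigma_0(x)\,\sigma_1(x)}}.
$$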

Figures (9)

  • Figure 1: Difference between perfectly negatively dependent ($\rho = -1$) and perfectly positively dependent ($\rho = 1$) potential outcomes. In both cases, the observed variables are the same. However, much larger prediction intervals for ITEs are needed when $\rho = -1$, while $\rho = 1$ leads to the narrowest possible intervals. The solid lines represent conditional expected values.
  • Figure 2: Synthetic $d=1$ dataset described in Section 5.1. The variance of the ITE decreases drastically as $\rho$ increases, and our $CW(\rho)$ intervals appear to achieve the correct coverage. For comparison, we also include the exact and inexact prediction methods of (Lei_2021_Conformal_Inference). Note that the observed values (treated and untreated units) are the same in all four panels.
  • Figure 3: Box plots comparing the coverage and average interval width of different estimation methods across cross-world correlation levels $\rho \in \{-1, -0.5, 0, 0.5, 1\}$, based on 50 synthetic datasets with $n = 2000$ and $d \in \{1, 15\}$. Methods include Lei’s exact and inexact intervals (Lei_2021_Conformal_Inference), the conformal DR-meta-learner (Alaa_2023_Conformal_Meta_learners), CMC (jonkers2024conformalconvolutionmontecarlo), and three versions of our $CW(\rho)$ intervals: the baseline $CW(\rho)$, a misspecified variant $CW(\text{misspec.}\rho)$ using $\tilde{\rho} = \rho - 0.25$ (capped at $-1$), and one with added conformal confidence intervals ($CW^{+CI}(\rho)$). The vertical line marks the desired $90\%$ coverage.
  • Figure 4: Average gap between the true cross-world correlation and its estimator as a function of the predictive power of the auxiliary variable $\gamma$. The left panel reports $\mathrm{GAP}=\mathbb{E}_X\!\left[\rho(X)-\widehat{\tilde{\rho}}(X)\right]$, while the right panel reports $\mathrm{GAP}=\mathbb{E}_X\!\left[\rho(X)-\widehat{\rho}(X)\right]$. Shaded bands are $95\%$ confidence intervals computed from repeated simulations as described in Appendix A; the horizontal line at $0$ indicates no bias.
  • Figure 5: Average gap between the true cross-world correlation and its estimator as a function of the predictive power of the auxiliary variable $\gamma$ and dimension of the auxiliary variable $Z$. The upper panels report $\mathrm{GAP}(\gamma)=\mathbb{E}_X\!\left[\rho(X)-\widehat{\tilde{\rho}}(X)\right]$, while the lower panels report $\mathrm{GAP}(\gamma)=\mathbb{E}_X\!\left[\rho(X)-\widehat{\rho}(X)\right]$. Shaded bands are $95\%$ confidence intervals; the horizontal line at $0$ indicates no bias.
  • ...and 4 more figures
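
The effect of the misspecified variant in Figure 3, which uses $\tilde{\rho} = \rho - 0.25$ capped at $-1$, can be illustrated with a toy Monte Carlo check. The sketch below is not the paper's experiment. It assumes standard Gaussian potential outcomes with zero means and unit variances, and it uses an oracle Gaussian interval rather than the $CW(\rho)$ procedure. The function name `empirical_coverage` and its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def empirical_coverage(rho_true, rho_used, n=20000, alpha=0.1, seed=0):
    """Fraction of simulated ITEs covered by an interval built with rho_used.

    Assumptions (not from the source): Y(0), Y(1) are standard Gaussian with
    correlation rho_true, both conditional means are 0, and the interval is
    the oracle Gaussian interval 0 +/- z * sqrt(2 - 2 * rho_used).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * math.sqrt(max(2.0 - 2.0 * rho_used, 0.0))
    hits = 0
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Build Y(1) so that cor(Y(0), Y(1)) equals rho_true.
        y1 = rho_true * y0 + math.sqrt(1.0 - rho_true ** 2) * noise
        if abs(y1 - y0) <= half_width:
            hits += 1
    return hits / n


if __name__ == "__main__":
    for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
        rho_tilde = max(rho - 0.25, -1.0)  # misspecified value, capped at -1
        print(
            f"rho={rho:+.2f}  "
            f"coverage(correct)={empirical_coverage(rho, rho):.3f}  "
            f"coverage(misspec.)={empirical_coverage(rho, rho_tilde):.3f}"
        )
```

Because understating $\rho$ can only widen the interval in this setting, the misspecified coverage never falls below the correctly specified one. The cost is extra width, not lost validity.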

Theorems & Definitions (31)

  • Definition 1: Cross-world assumption
  • Lemma 2.1
  • Definition 2: $CW(\rho)$ intervals
  • Theorem 1: Motivation and optimality under a perfect (asymptotic) scenario
  • Theorem 2: Gaussian case
  • Definition 3
  • Theorem 3: Conditional validity of $CW^{+CI}(\rho)$ intervals when $\rho = 1$
  • Lemma A.1: Idea; full statement in Lemma \ref{lem_appendix_rho_lower_consistency}
  • Theorem 4: Largest-width Scenario $\rho = -1$
  • Theorem 5: Assuming $\rho = 1$ is sufficient
  • ...and 21 more