Accounting for Missing Covariates in Heterogeneous Treatment Estimation

Khurram Yamin, Vibhhu Sharma, Ed Kennedy, Bryan Wilder

Abstract

Many applications of causal inference require using treatment effects estimated on a study population to make decisions in a separate target population. We consider the challenging setting where there are covariates that are observed in the target population that were not seen in the original study. Our goal is to estimate the tightest possible bounds on heterogeneous treatment effects conditioned on such newly observed covariates. We introduce a novel partial identification strategy based on ideas from ecological inference; the main idea is that estimates of conditional treatment effects for the full covariate set must marginalize correctly when restricted to only the covariates observed in both populations. Furthermore, we introduce a bias-corrected estimator for these bounds and prove that it enjoys fast convergence rates and statistical guarantees (e.g., asymptotic normality). Experimental results on both real and synthetic data demonstrate that our framework can produce bounds that are much tighter than would otherwise be possible.

Paper Structure

This paper contains 17 sections, 4 theorems, 57 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

Assuming $Y$ is a real-valued outcome bounded in $[a,b]$ such that ... where $\mu_a(v) = \mathbb{E}(Y \mid V=v, A=a, E=1)$ and $\nu(v,w) = \mathbb{P}(W=w \mid V=v, E=0)$.
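The marginalization idea described in the abstract can be illustrated with a small sketch. This is not the statement of Theorem 1, which is truncated above. It is the classical ecological-inference bound, under the simplifying assumptions that $W$ is discrete, that $\nu(v,\cdot)$ is strictly positive, and that the unobserved conditional means $m_a(v,w) = \mathbb{E}(Y \mid V=v, W=w, A=a)$ average to $\mu_a(v)$ under the weights $\nu(v,w)$. The names `marginalization_bounds`, `cate_bounds`, `lo`, and `hi` are hypothetical and chosen for illustration only.

```python
import numpy as np


def marginalization_bounds(mu, nu, lo, hi):
    """Bound each m(w) given sum_w nu[w] * m(w) == mu and lo <= m(w) <= hi.

    Illustrative sketch only: mu plays the role of mu_a(v) and nu the role
    of nu(v, .) for one fixed v; lo and hi are the outcome range.
    """
    nu = np.asarray(nu, dtype=float)
    if np.any(nu <= 0) or not np.isclose(nu.sum(), 1.0):
        raise ValueError("nu must be strictly positive and sum to 1")
    # Push every other cell to its extreme value and solve for m(w).
    lower = np.maximum(lo, (mu - (1.0 - nu) * hi) / nu)
    upper = np.minimum(hi, (mu - (1.0 - nu) * lo) / nu)
    return lower, upper


def cate_bounds(mu1, mu0, nu, lo, hi):
    """Bounds on m_1(w) - m_0(w) obtained by bounding each arm separately."""
    l1, u1 = marginalization_bounds(mu1, nu, lo, hi)
    l0, u0 = marginalization_bounds(mu0, nu, lo, hi)
    return l1 - u0, u1 - l0


# Hypothetical numbers: outcome in [0, 1], two equally likely values of W.
cate_lower, cate_upper = cate_bounds(mu1=0.8, mu0=0.3, nu=[0.5, 0.5], lo=0.0, hi=1.0)
```

With these hypothetical inputs, each subgroup's effect is bounded to approximately $[0, 1]$, whereas the outcome range alone gives only $[-1, 1]$. The bounds tighten further as $\nu(v,\cdot)$ concentrates on a single value of $W$.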

Figures (4)

  • Figure 1: Top: difference in estimation error between the plug-in and bias-corrected estimators under varying errors in outcome and propensity modeling. Red indicates higher loss for the plug-in estimator. All errors are measured in mean absolute deviation. Bottom: average worst-case bound width as a function of the entropy of $\nu$, where the range of outcomes is 40.
  • Figure 2: Simulation data. Top: bounds in the sensitivity model as a function of $\delta$, averaged over units. The red and blue lines are the bounds output by our method (Bias Correction), with a 95$\%$ CI. The black lines are bounds that use only the sensitivity assumption (the equation labeled eq:sensitivity) without our ecological inference framework. Bottom: the green line is the fraction of times our bounds cover the true CATE; the blue line is the coverage of the 95$\%$ CI of the Restricted CATE.
  • Figure 3: RCT data. Labeling matches Figure 2.
  • Figure 4: Benchmarking distribution for the simulation data (left) and the RCT data (right). The final $\hat{\delta}$ is the mean of each distribution and is within $\sim 10\%$ of the true $\delta$ in both cases. We consider all subsets $V'$ and $W'$ that can be formed from $V$ given the number of variables in $W'$ (3 for the simulation data and 1 for the RCT data).

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4