Causal Inference with High-Dimensional Treatments

Patrick Kramer, Edward H. Kennedy, Isaac M. Opper

Abstract

In this work, we consider causal inference in various high-dimensional treatment settings, including for single multi-valued treatments and vector treatments with binary or continuous components, when the number of treatments can be comparable to or even larger than the number of observations. These settings bring unique challenges: first, the treatment effects of interest are a high-dimensional vector rather than a low-dimensional scalar; second, positivity violations are often unavoidable; and third, estimation can be based on a smaller effective sample size. We first discuss fundamental limits of estimating effects here, showing that consistent estimation is impossible without further assumptions. We go on to propose a novel sparse pseudo-outcome regression framework for arbitrary high-dimensional statistical functionals, which includes generic constrained regression estimators and error guarantees. We use the framework to derive new doubly robust estimators for mean potential outcomes of high-dimensional treatments, though it can also be applied to other scenarios. We analyze the proposed estimators under exact and approximate sparsity assumptions, giving finite-sample risk bounds. Finally, we derive minimax lower bounds to characterize optimal rates of convergence and show our risk bounds are unimprovable.
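The combination of doubly robust pseudo-outcomes and sparse estimation described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a single multi-valued treatment, uses a standard AIPW-style pseudo-outcome, and swaps in coordinate-wise soft-thresholding (the penalized Lasso solution when each treatment level has its own coordinate) in place of the paper's constrained estimators. All function names, the simulated data, and the threshold `lam` are hypothetical choices made for illustration.

```python
import numpy as np


def aipw_pseudo_outcomes(y, a, levels, propensity, outcome_mean):
    """Doubly robust (AIPW-style) pseudo-outcomes for a multi-valued treatment.

    y: (n,) outcomes; a: (n,) observed treatment labels; levels: (k,) labels.
    propensity[i, j]: assumed estimate of P(A = levels[j] | X_i).
    outcome_mean[i, j]: assumed estimate of E[Y | X_i, A = levels[j]].
    Returns an (n, k) array; the mean of column j estimates E[Y^{levels[j]}].
    """
    indicator = (a[:, None] == levels[None, :]).astype(float)
    return indicator / propensity * (y[:, None] - outcome_mean) + outcome_mean


def soft_threshold(z, lam):
    """Coordinate-wise minimizer of 0.5 * (z - psi)^2 + lam * |psi|."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def demo(n=5000, k=20, lam=0.3, seed=0):
    """Simulated example (hypothetical data, no covariates)."""
    rng = np.random.default_rng(seed)
    levels = np.arange(k)
    psi_true = np.zeros(k)
    psi_true[:3] = 2.0                      # only three levels have nonzero mean
    a = rng.integers(0, k, size=n)          # uniform randomization over k levels
    y = psi_true[a] + rng.normal(size=n)
    propensity = np.full((n, k), 1.0 / k)   # known by design in this simulation
    outcome_mean = np.zeros((n, k))         # deliberately crude outcome model
    phi = aipw_pseudo_outcomes(y, a, levels, propensity, outcome_mean)
    psi_hat = soft_threshold(phi.mean(axis=0), lam)
    return psi_hat, psi_true
```

Calling `demo()` returns the thresholded estimate alongside the simulated truth. Because the propensities are correct in this simulation, the pseudo-outcome averages remain unbiased even though the outcome model is crude, which is the double robustness property in its simplest form. The paper's actual estimators, assumptions, and tuning are given in Definition 1 and Theorem 1.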

Paper Structure (88 sections, 19 theorems, 232 equations)

Key Result

Theorem 1

Let $\widehat{\psi}\in\{\widehat{\psi}_\text{lasso}, \widehat{\psi}_\text{subset}\}$ be either the Lasso estimator defined in (def:lasso) or the best subset selection estimator defined in (def:bestsubset) minimizing the empirical risk in (eq:emp-risk). Suppose exact sparsity, i.e., $\lVert\psi\rVert \dots$ for rates $r_1 = r_1(k)\gtrsim 1$, $r_2=r_2(n,k)$, and constants $R,C_1,C_2,C_3,C_4>0$ that are indep

Theorems & Definitions (42)

  • Remark 1
  • Definition 1: Best subset selection and Lasso estimator
  • Theorem 1: Error rate under exact sparsity
  • Remark 2: Discussion of the assumptions
  • Remark 3: Exact versus approximate sparsity
  • Remark 4: Error metric
  • Remark 5: Weighing the treatments
  • Remark 6: Choosing a suitable pseudo-outcome
  • Remark 7: Choice of the constraint parameter
  • Remark 8: Connection to Gaussian sequence model
  • ...and 32 more