Causal Inference with High-Dimensional Treatments

Patrick Kramer; Edward H. Kennedy; Isaac M. Opper

Abstract

In this work, we consider causal inference in various high-dimensional treatment settings, including for single multi-valued treatments and vector treatments with binary or continuous components, when the number of treatments can be comparable to or even larger than the number of observations. These settings bring unique challenges: first, the treatment effects of interest are a high-dimensional vector rather than a low-dimensional scalar; second, positivity violations are often unavoidable; and third, estimation can be based on a smaller effective sample size. We first discuss fundamental limits of estimating effects here, showing that consistent estimation is impossible without further assumptions. We go on to propose a novel sparse pseudo-outcome regression framework for arbitrary high-dimensional statistical functionals, which includes generic constrained regression estimators and error guarantees. We use the framework to derive new doubly robust estimators for mean potential outcomes of high-dimensional treatments, though it can also be applied to other scenarios. We analyze the proposed estimators under exact and approximate sparsity assumptions, giving finite-sample risk bounds. Finally, we derive minimax lower bounds to characterize optimal rates of convergence and show our risk bounds are unimprovable.

Paper Structure

This paper contains 88 sections, 19 theorems, 232 equations.

Introduction
Our Contributions
Related work
Setup & Notation
High-Dimensional Treatments
Types of High-Dimensional Treatments
Single Multi-Valued Treatments
Binary Vector Treatments
Continuous Vector Treatments
Positivity Violation
Pseudo-Outcome Regression on High-Dimensional Treatments
Motivation
Proposed Estimators
Master Theorem
Single Multi-Valued Treatments
...and 73 more sections

Key Result

Theorem 1

Let $\widehat{\psi}\in\{\widehat{\psi}_\text{lasso}, \widehat{\psi}_\text{subset}\}$ be either the Lasso estimator defined in (def:lasso) or the best subset selection estimator defined in (def:bestsubset) minimizing the empirical risk in (eq:emp-risk). Suppose exact sparsity, i.e., $\lVert\psi\rVert$ […] for rates $r_1 = r_1(k)\gtrsim 1$, $r_2=r_2(n,k)$, and constants $R,C_1,C_2,C_3,C_4>0$ that are indep[…]
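
To make the idea of a Lasso-type estimator built on doubly robust pseudo-outcomes concrete, the following is a minimal sketch for a single multi-valued treatment with $k$ levels. It is not the paper's estimator: the empirical risk in (eq:emp-risk) is not reproduced on this page, so the sketch assumes a plain unweighted squared-error risk, under which the $\ell_1$-penalized solution reduces to soft-thresholding the per-level averages of the pseudo-outcomes. The function names, the arguments `pi_hat` and `mu_hat` (nuisance estimates assumed to be supplied by the user, with `pi_hat` bounded away from zero), and the penalty level `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def aipw_pseudo_outcomes(y, a, pi_hat, mu_hat):
    """Return an (n, k) matrix of doubly robust (AIPW-style) pseudo-outcomes.

    y      : (n,) outcomes
    a      : (n,) integer treatment levels in {0, ..., k-1}
    pi_hat : (n, k) assumed estimates of P(A = j | X_i), all strictly positive
    mu_hat : (n, k) assumed estimates of E[Y | X_i, A = j]
    """
    n, k = pi_hat.shape
    indicator = np.zeros((n, k))
    indicator[np.arange(n), a] = 1.0
    # Inverse-probability-weighted residual plus the outcome-regression term.
    return indicator / pi_hat * (y[:, None] - mu_hat) + mu_hat


def soft_threshold(v, lam):
    """Solve argmin_p 0.5 * ||v - p||_2^2 + lam * ||p||_1 coordinate-wise."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_mean_potential_outcomes(y, a, pi_hat, mu_hat, lam):
    """Illustrative sparse estimate of the k mean potential outcomes.

    Assumption: unweighted squared-error risk, so the Lasso solution is
    soft-thresholding of the averaged pseudo-outcomes.
    """
    phi = aipw_pseudo_outcomes(y, a, pi_hat, mu_hat)
    psi_raw = phi.mean(axis=0)  # unpenalized doubly robust estimate per level
    return soft_threshold(psi_raw, lam)
```

In this simplified form, levels whose raw estimate falls below `lam` in absolute value are set exactly to zero, which is the behavior an exact-sparsity assumption is meant to exploit. How to weight the treatment levels and how to pick the constraint or penalty level are design choices the paper discusses in its own remarks and are not settled by this sketch.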

Theorems & Definitions (42)

Remark 1
Definition 1: Best subset selection and Lasso estimator
Theorem 1: Error rate under exact sparsity
Remark 2: Discussion of the assumptions
Remark 3: Exact versus approximate sparsity
Remark 4: Error metric
Remark 5: Weighing the treatments
Remark 6: Choosing a suitable pseudo-outcome
Remark 7: Choice of the constraint parameter
Remark 8: Connection to Gaussian sequence model
...and 32 more
