
Debiased Inverse Propensity Score Weighting for Estimation of Average Treatment Effects with High-Dimensional Confounders

Yuhao Wang, Rajen D. Shah

Abstract

We consider estimation of average treatment effects given observational data with high-dimensional pretreatment variables. Existing methods for this problem typically assume some form of sparsity for the regression functions. In this work, we introduce a debiased inverse propensity score weighting (DIPW) scheme for average treatment effect estimation that delivers $\sqrt{n}$-consistent estimates when the propensity score follows a sparse logistic regression model; the outcome regression functions are permitted to be arbitrarily complex. We further demonstrate how confidence intervals centred on our estimates may be constructed. Our theoretical results quantify the price to pay for permitting the regression functions to be unestimable, which shows up as an inflation of the variance of the estimator compared to the semiparametric efficient variance by a constant factor, under mild conditions. We also show that when outcome regressions can be estimated faster than a slow $1/\sqrt{ \log n}$ rate, our estimator achieves semiparametric efficiency. As our results accommodate arbitrary outcome regression functions, averages of transformed responses under each treatment may also be estimated at the $\sqrt{n}$ rate. Thus, for example, the variances of the potential outcomes may be estimated. We discuss extensions to estimating linear projections of the heterogeneous treatment effect function and explain how propensity score models with more general link functions may be handled within our framework. An R package \texttt{dipw} implementing our methodology is available on CRAN.
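The remark about variances of the potential outcomes can be made concrete with a standard identity. The notation is an assumption here: $Y(1)$ denotes the potential outcome under treatment, as in the Figure 3 caption. Because the outcome regression functions may be arbitrary, both $Y$ and the transformed response $Y^2$ can be used as the response, and the two resulting means combine as

$$
\mathrm{Var}(Y(1)) = \mathbb{E}\left[Y(1)^2\right] - \left(\mathbb{E}\left[Y(1)\right]\right)^2 .
$$

Each term on the right is an average of a (possibly transformed) response under treatment, which suggests a plug-in estimate of the variance built from two such mean estimates.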

Paper Structure

This paper contains 51 sections, 34 theorems, 284 equations, 6 figures, 1 table, 1 algorithm.

Key Result

Lemma

The function $\mu_i$ of $X_1, \ldots, X_n$ that minimises $\mathrm{Var}(\tau_{\mathrm{ORA}, i})$ is $\mu_i(X_1,\ldots,X_n) = \mu_{\mathrm{ORA}, i}$.

Figures (6)

  • Figure 1: Boxplots of the estimation error $|\hat{\tau} - \bar{\tau}|$ under different covariate designs and linear functions $b(\cdot)$ and $\Delta(\cdot)$ with different sparsity levels $s$ for the propensity model coefficients; the white dots correspond to means.
  • Figure 2: As Figure 1 but with nonlinear functions $b(\cdot)$ and $\Delta(\cdot)$.
  • Figure 3: Boxplots of the error in estimating $\mathrm{Var}(Y(1))$; the interpretation is analogous to Figure 1. For the Toeplitz design settings with $s = 20, 50$, the boxplots for TMLE and TMLErf are omitted because their error bars are too long to display. The median absolute errors of TMLE and TMLErf are $6.4$ and $10.8$, respectively, for $s = 20$, and $81.8$ and $205.1$ for $s = 50$.
  • Figure 4: Boxplot of estimation error with heteroscedastic noise and linear responses.
  • Figure 5: Boxplot of estimation error with heteroscedastic noise and nonlinear responses.
  • ...and 1 more figure

Theorems & Definitions (60)

  • Lemma
  • Theorem
  • Theorem
  • Theorem
  • Theorem
  • Corollary
  • Lemma
  • Proof
  • Lemma
  • Proof
  • ...and 50 more