How Much Weak Overlap Can Doubly Robust T-Statistics Handle?

Jacob Dorn

TL;DR

This work addresses causal effect estimation under weak overlap by proposing a thresholded doubly robust AIPW estimator that remains asymptotically normal and yields well-calibrated Wald confidence intervals when nuisance components are estimated nonparametrically with cross-fitting. It derives precise rate conditions linking the clipping threshold $b_n$, nuisance errors $r_{\mu,n}, r_{e,n}$, and the overlap tail parameter $\gamma_0$, showing a nontrivial cost of weak overlap on outcome-smoothness and black-box nuisance requirements. The paper also shows that the global regression rate under weak overlap equals the pointwise rate, removing polylogarithmic penalties, and provides actionable thresholding rules for empirical use. In simulations and an application to right-heart catheterization, thresholded AIPW demonstrates favorable calibration and comparable efficiency to standard fixed-trimming approaches, supporting its practical appeal for full-population estimands with weak overlap.

Abstract

In the presence of sufficiently weak overlap, it is known that no regular root-n-consistent estimators exist and standard estimators may fail to be asymptotically normal. This paper shows that a thresholded version of the standard doubly robust estimator is asymptotically normal with well-calibrated Wald confidence intervals even when constructed using nonparametric estimates of the propensity score and conditional mean outcome. The analysis implies a cost of weak overlap in terms of black-box nuisance rates, borne when the semiparametric bound is infinite, and the contribution of outcome smoothness to the outcome regression rate, which is incurred even when the semiparametric bound is finite. As a byproduct of this analysis, I show that under weak overlap, the optimal global regression rate is the same as the optimal pointwise regression rate, without the usual polylogarithmic penalty. The high-level conditions yield new rules of thumb for thresholding in practice. In simulations, thresholded AIPW can exhibit moderate overrejection in small samples, but I am unable to reject a null hypothesis of exact coverage in large samples. In an empirical application, the clipped AIPW estimator that targets the standard average treatment effect yields similar precision to a heuristic 10% fixed-trimming approach that changes the target sample.
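
To make the clipped estimator concrete, the following is a minimal sketch of one common form of clipped AIPW for the average treated potential outcome $E[Y(1)]$, together with its Wald interval. It is an illustration under stated assumptions, not the paper's algorithm: the function name `clipped_aipw`, the argument names, and the choice to clip by replacing the estimated propensity with `max(e_hat, b)` are all hypothetical, and the nuisance predictions `mu_hat` and `e_hat` are assumed to have been produced out-of-fold (cross-fitted) elsewhere.

```python
import math
from statistics import NormalDist, mean, stdev


def clipped_aipw(y, d, mu_hat, e_hat, b, level=0.95):
    """Clipped AIPW estimate of E[Y(1)] with a Wald confidence interval.

    Assumptions (not taken from the paper): mu_hat[i] and e_hat[i] are
    out-of-fold predictions of E[Y | X_i, D = 1] and P(D = 1 | X_i), and
    clipping replaces the estimated propensity with max(e_hat[i], b).
    """
    # Doubly robust score: regression prediction plus a clipped
    # inverse-propensity-weighted residual for treated units only.
    scores = [
        m + di * (yi - m) / max(e, b)
        for yi, di, m, e in zip(y, d, mu_hat, e_hat)
    ]
    n = len(scores)
    est = mean(scores)
    # Standard error from the sample standard deviation of the scores.
    se = stdev(scores) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return est, se, (est - z * se, est + z * se)
```

With `b = 0` the function reduces to unclipped AIPW, so comparing the two outputs on the same data gives a quick sense of how much a few very small estimated propensities drive the estimate and its standard error.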

Paper Structure

This paper contains 26 sections, 41 theorems, 128 equations, 14 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

(i) Suppose def:AllowedDistributions holds for some $\gamma_0 > 2$. Then the semiparametric bound is finite for all $P \in \mathscr{P}$. (ii) Suppose def:AllowedDistributions holds for some $\gamma_0 \in (1, 2)$, and there is a $P \in \mathscr{P}$ and $C' > 0$ such that $P( e(X) \leq \pi ) \geq C' \

Figures (14)

  • Figure 1: Simulations of 10,000 observations of $e(X)$ with $P(e(X) \leq \pi) = \pi^{\gamma_0 - 1}$ for increasing values of $\gamma_0$.
  • Figure 2: Histograms of simulated point estimates for the various methods considered. Vertical dotted and solid lines indicate the true causal effect and the median estimate, respectively. Clipped estimators achieve much better performance than unthresholded estimators, and clipped AIPW's debiasing property is also apparent.
  • Figure 3: Histograms of simulated t-statistics on the true null hypothesis for various sample sizes. Vertical solid and dotted lines indicate the mean t-statistic and the target mean t-statistic of zero, respectively. The dashed line corresponds to the calibrated Gaussian density targeted in the Shapiro-Wilk test for normality.
  • Figure 4: Histograms of simulation p-values on null hypothesis of true average potential outcome for various sample sizes. Dotted lines correspond to the target Uniform(0, 1) density. P-values in labels correspond to Kolmogorov-Smirnov tests for the Uniform(0, 1) distribution.
  • Figure 5: Simulated treated observations for one simulation of 1,000 observations. It is rare to see treated observations with small $X$, which corresponds to small values of $e(X) = X^{1/(\gamma_0 - 1)}$. As a result, such observations can have high leverage when predicting $E[Y \mid X = 0, D = 1]$, and can produce large discrepancies between the true (dashed) and predicted (solid) regression lines.
  • ...and 9 more figures
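
The propensity design in the captions for Figures 1 and 5 can be reproduced with a short sketch. It assumes $X \sim \text{Uniform}(0, 1)$, which is not stated in the captions but is the choice under which $e(X) = X^{1/(\gamma_0 - 1)}$ gives $P(e(X) \leq \pi) = P(X \leq \pi^{\gamma_0 - 1}) = \pi^{\gamma_0 - 1}$. The function names `simulate_propensities` and `tail_share` are hypothetical.

```python
import random


def simulate_propensities(n, gamma0, seed=0):
    """Draw e(X) = X ** (1 / (gamma0 - 1)) with X ~ Uniform(0, 1).

    The uniform distribution for X is an assumption of this sketch;
    it requires gamma0 > 1.
    """
    rng = random.Random(seed)
    power = 1.0 / (gamma0 - 1.0)
    return [rng.random() ** power for _ in range(n)]


def tail_share(e, pi):
    """Empirical share of propensities at or below pi."""
    return sum(ei <= pi for ei in e) / len(e)


if __name__ == "__main__":
    # Smaller gamma0 puts more mass near zero, i.e. weaker overlap.
    for gamma0 in (1.5, 2.0, 3.0):
        e = simulate_propensities(10_000, gamma0)
        print(gamma0, tail_share(e, 0.04), 0.04 ** (gamma0 - 1.0))
```

Each printed row compares the empirical tail share with the target $\pi^{\gamma_0 - 1}$ at $\pi = 0.04$, which should agree up to sampling noise.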

Theorems & Definitions (90)

  • Proposition 1
  • Proposition 2: Consistency
  • Theorem 1: (Slow) Asymptotic Normality
  • Corollary 1: T-tests are well-calibrated
  • Corollary 2: Thresholding is second-order under somewhat weak overlap
  • Proposition 3: Consistency rate
  • Corollary 3: Worst-case consistency rate
  • Corollary 4: Under very weak overlap, faster rates are necessary
  • Example 1: Somewhat weak overlap
  • Example 2: Second moments barely fail to exist
  • ...and 80 more