How Much Weak Overlap Can Doubly Robust T-Statistics Handle?
Jacob Dorn
TL;DR
This work addresses causal effect estimation under weak overlap by proposing a thresholded doubly robust AIPW estimator that remains asymptotically normal and yields well-calibrated Wald confidence intervals when nuisance components are estimated nonparametrically with cross-fitting. It derives rate conditions linking the clipping threshold $b_n$, the nuisance errors $r_{\mu,n}$ and $r_{e,n}$, and the overlap tail parameter $\gamma_0$, showing that weak overlap imposes a nontrivial cost on both the required outcome smoothness and the black-box nuisance rates. The paper also shows that under weak overlap the optimal global regression rate equals the optimal pointwise rate, without the usual polylogarithmic penalty, and provides rules of thumb for choosing the threshold in practice. In simulations, thresholded AIPW can overreject moderately in small samples but shows no detectable departure from exact coverage in large samples; in an application to right-heart catheterization, it attains precision similar to a heuristic 10% fixed-trimming approach while still targeting the full-population average treatment effect.
Abstract
In the presence of sufficiently weak overlap, it is known that no regular root-n-consistent estimators exist and standard estimators may fail to be asymptotically normal. This paper shows that a thresholded version of the standard doubly robust estimator is asymptotically normal with well-calibrated Wald confidence intervals even when constructed using nonparametric estimates of the propensity score and conditional mean outcome. The analysis implies that weak overlap carries a cost in terms of black-box nuisance rates, borne when the semiparametric bound is infinite, and a cost in terms of the contribution of outcome smoothness to the outcome regression rate, incurred even when the semiparametric bound is finite. As a byproduct of this analysis, I show that under weak overlap, the optimal global regression rate is the same as the optimal pointwise regression rate, without the usual polylogarithmic penalty. The high-level conditions yield new rules of thumb for thresholding in practice. In simulations, thresholded AIPW can exhibit moderate overrejection in small samples, but I am unable to reject a null hypothesis of exact coverage in large samples. In an empirical application, the clipped AIPW estimator that targets the standard average treatment effect yields similar precision to a heuristic 10% fixed-trimming approach that changes the target sample.
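To make the estimator described above concrete, the following is a minimal sketch of a clipped AIPW point estimate with a Wald interval. It assumes that "thresholding" means clipping the estimated propensity score to $[b_n, 1 - b_n]$ and that the nuisance predictions passed in are already cross-fitted (out-of-fold). The function name `clipped_aipw`, its arguments, and the threshold `n ** (-0.25)` in the demo are illustrative placeholders, not the paper's notation or its recommended rule.

```python
import numpy as np


def clipped_aipw(y, d, mu1_hat, mu0_hat, e_hat, b):
    """Return (estimate, standard error, 95% Wald CI) for the ATE.

    Assumption: e_hat, mu1_hat, mu0_hat are out-of-fold predictions, and
    thresholding is implemented by clipping e_hat to [b, 1 - b].
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    mu1_hat = np.asarray(mu1_hat, dtype=float)
    mu0_hat = np.asarray(mu0_hat, dtype=float)
    e_clip = np.clip(np.asarray(e_hat, dtype=float), b, 1.0 - b)

    # AIPW score: regression contrast plus inverse-propensity-weighted residuals.
    psi = (mu1_hat - mu0_hat
           + d * (y - mu1_hat) / e_clip
           - (1.0 - d) * (y - mu0_hat) / (1.0 - e_clip))

    n = psi.shape[0]
    est = psi.mean()
    se = psi.std(ddof=1) / np.sqrt(n)   # plug-in standard error of the mean score
    return est, se, (est - 1.96 * se, est + 1.96 * se)


# Illustrative data with weak overlap: the propensity e(x) = x approaches 0 and 1.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(size=n)
d = rng.binomial(1, x)
y = 1.0 * d + x + rng.normal(size=n)    # true ATE is 1 by construction

b_n = n ** (-0.25)                      # placeholder threshold, NOT the paper's rule
# Oracle nuisances are used here only to keep the sketch short.
est, se, ci = clipped_aipw(y, d, 1.0 + x, x, x, b_n)
print(est, se, ci)
```

In practice the oracle nuisances in the demo would be replaced by cross-fitted estimates from whatever learners are in use, and the threshold would follow the rules of thumb the paper derives rather than the placeholder above.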
