Covariate Adjustment in Randomized Experiments Motivated by Higher-Order Influence Functions
Sihui Zhao, Xinbo Wang, Lin Liu, Xin Zhang
TL;DR
The paper studies covariate adjustment in randomized trials with high-dimensional baseline covariates, using Higher-Order Influence Functions (HOIF) to construct order-optimal, design-based estimators of treatment-specific means and the average treatment effect (ATE). It shows that HOIF-motivated estimators, notably the adj,2 variant and its debiased relatives, can outperform the unadjusted estimator and linear-model-based adjustments when $p$ grows with $n$, and it gives formal bias and variance characterizations. It also shows that several state-of-the-art adjusted estimators are special cases of the HOIF framework, which places these approaches under a common theoretical lens, and it provides practical, unbiased variance estimators and an R package for implementation. Simulations and a real-data application (NIDA-CTN-0030) corroborate the theoretical results.
Abstract
Higher-Order Influence Functions (HOIF), developed in a series of papers over the past twenty years, are a fundamental theoretical device for constructing rate-optimal causal-effect estimators from observational studies. However, the value of HOIF for analyzing well-conducted randomized controlled trials (RCTs) has not been explicitly explored. The recent U.S. Food and Drug Administration and European Medicines Agency guidelines on covariate adjustment in the analysis of RCTs recommend reporting, in addition to the simple, unadjusted difference-in-means estimator, an estimator that adjusts for baseline covariates via a simple parametric working model, such as a linear model. However, when the number of baseline covariates $p$ is large, the recommendation is somewhat murky. In this paper, we show that HOIF-motivated estimators for the treatment-specific mean have significantly improved statistical properties compared with adjusted estimators that are popular in practice when $p$ is large relative to the sample size $n$. We also characterize the conditions under which the HOIF-motivated estimator improves upon the unadjusted one. More importantly, we demonstrate that several recently proposed state-of-the-art adjusted estimators can be interpreted as particular HOIF-motivated estimators, thereby placing these estimators in a more unified framework. Numerical and empirical studies are conducted to corroborate our theoretical findings. An accompanying R package can be found on CRAN.
