Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Stefan Wager, Susan Athey
TL;DR
This paper develops causal forests, a non-parametric method that extends Breiman's random forests to estimate heterogeneous treatment effects under unconfoundedness. It proves pointwise consistency and asymptotic normality of the forest estimate of the conditional average treatment effect $\tau(x)$, and it provides a practical, consistent variance estimator based on the infinitesimal jackknife that applies to a broad class of forest variants, including honest and double-sample trees. In simulations, causal forests outperform classical nearest-neighbor methods in mean-squared error and achieve nominal confidence interval coverage in moderate to high dimensions. The work thus supports data-driven inference on treatment effect heterogeneity, with applications in personalized medicine, policy evaluation, and targeted marketing.
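For reference, the estimand and the assumption named above can be written in standard potential-outcomes notation. The symbols $Y_i(0)$, $Y_i(1)$, $W_i$, $X_i$, $\hat{\tau}(x)$, and $\hat{\sigma}_n(x)$ are introduced here for illustration and are not defined in this summary: $Y_i(w)$ is the outcome unit $i$ would have under treatment status $w$, $W_i \in \{0, 1\}$ is the treatment indicator, and $X_i$ is the covariate vector.

$$
\tau(x) = \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid X_i = x \right],
\qquad
\{ Y_i(0),\, Y_i(1) \} \perp\!\!\!\perp W_i \mid X_i .
$$

The asymptotic normality result then takes the generic form below, which is what justifies an interval centered at the forest estimate. Here $\hat{\sigma}_n(x)$ stands for the estimated standard error and $z_{1-\alpha/2}$ for the standard normal quantile.

$$
\frac{\hat{\tau}(x) - \tau(x)}{\hat{\sigma}_n(x)} \Rightarrow \mathcal{N}(0, 1),
\qquad
\hat{\tau}(x) \pm z_{1-\alpha/2}\, \hat{\sigma}_n(x).
$$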
Abstract
Many scientific and engineering challenges -- ranging from personalized medicine to customized marketing recommendations -- require an understanding of treatment effect heterogeneity. In this paper, we develop a non-parametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect, and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
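To make the nearest-neighbor matching baseline mentioned in the abstract concrete, here is a minimal sketch of a k-NN matching estimate of the treatment effect at a query point. It is not the paper's implementation: the function name `knn_cate`, the choice of Euclidean distance, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def knn_cate(X, y, w, x0, k=10):
    """Hypothetical k-NN matching estimate of the treatment effect at x0.

    Returns the mean outcome of the k treated units nearest to x0 minus
    the mean outcome of the k control units nearest to x0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w).astype(bool)
    # Euclidean distance from every sample to the query point (an assumption;
    # other metrics are possible).
    dist = np.linalg.norm(X - np.asarray(x0, dtype=float), axis=1)

    def nearest_mean(mask):
        idx = np.flatnonzero(mask)
        if idx.size < k:
            raise ValueError("fewer than k units in one treatment arm")
        nearest = idx[np.argsort(dist[idx])[:k]]
        return y[nearest].mean()

    return nearest_mean(w) - nearest_mean(~w)


# Simulated randomized experiment (illustrative only, not the paper's design).
rng = np.random.default_rng(0)
n, d = 4000, 2
X = rng.uniform(size=(n, d))
w = rng.integers(0, 2, size=n)      # treatment assigned by a fair coin flip
tau = 1.0 + 2.0 * X[:, 0]           # true effect varies with the first covariate
y = X[:, 1] + w * tau + rng.normal(scale=0.5, size=n)

x0 = np.array([0.5, 0.5])           # true effect at x0 is 1.0 + 2.0 * 0.5 = 2.0
est = knn_cate(X, y, w, x0, k=50)
print(f"k-NN estimate at x0: {est:.3f}")
```

Because the distance treats every coordinate equally, appending covariates that do not affect the outcome pushes the selected neighbors farther from `x0` in the coordinates that matter. That is the setting in which the abstract reports the largest gains for causal forests.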
