Treatment Effect Estimation with Observational Network Data using Machine Learning
Corinne Emmenegger, Meta-Lina Spohn, Timon Elmer, Peter Bühlmann
TL;DR
The paper addresses causal inference for treatment effects when units interact on a known network, so that spillovers violate the usual independence assumption. It develops a semiparametric network AIPW estimator for the expected average treatment effect (EATE) under a structural equation model, using cross-fitting to accommodate flexible ML nuisance estimation and a central limit theorem based on the dependency graph to handle network dependence. The estimator is $\sqrt{N}$-consistent and asymptotically normal, and it comes with a consistent bootstrap variance estimator, which enables valid confidence intervals and p-values from a single network. Empirical validation includes simulations under various network topologies and an application to the Swiss StudentLife data, showing that accounting for spillovers alters estimated effects and improves inference. The approach provides a practical, theoretically justified framework for unit-level causal effects in networks, and extensions to global effects such as the GATE are discussed.
Abstract
Causal inference methods for treatment effect estimation usually assume independent units. However, this assumption is often questionable because units may interact, resulting in spillover effects between them. We develop augmented inverse probability weighting (AIPW) for estimation and inference of the expected average treatment effect (EATE) with observational data from a single (social) network with spillover effects. In contrast to overall effects such as the global average treatment effect (GATE), the EATE measures, in expectation and on average over all units, how the outcome of a unit is causally affected by its own treatment, marginalizing over the spillover effects from other units. We develop cross-fitting theory with plugin machine learning to obtain a semiparametric treatment effect estimator that converges at the parametric rate and asymptotically follows a Gaussian distribution. The asymptotics are developed using the dependency graph rather than the network graph, which makes explicit that we allow for spillover effects beyond immediate neighbors in the network. We apply our AIPW method to the Swiss StudentLife Study data to investigate the effect of hours spent studying on exam performance accounting for the students' social network.
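To make the AIPW idea concrete, the following is a minimal sketch of the textbook AIPW score, not the authors' implementation. It assumes the nuisance predictions have already been produced by some cross-fitted learner on held-out units. The names `aipw_scores`, `aipw_point_estimate`, `g1`, `g0`, and `h` are hypothetical: `g1` and `g0` stand for predicted outcomes under treatment and control, and `h` for the predicted treatment probability. In the network setting, the inputs to those predictions would also include features summarizing other units' treatments and covariates; that step is not shown here.

```python
from statistics import mean


def aipw_scores(y, w, g1, g0, h):
    """Return per-unit AIPW scores (textbook form, illustrative only).

    y  : observed outcomes
    w  : binary treatment indicators (0 or 1)
    g1 : predicted outcomes under treatment   (assumed cross-fitted)
    g0 : predicted outcomes under control     (assumed cross-fitted)
    h  : predicted treatment probabilities    (assumed cross-fitted)
    """
    scores = []
    for yi, wi, g1i, g0i, hi in zip(y, w, g1, g0, h):
        if not 0.0 < hi < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        # Regression difference plus inverse-probability-weighted residuals.
        treated_term = wi * (yi - g1i) / hi
        control_term = (1 - wi) * (yi - g0i) / (1.0 - hi)
        scores.append(g1i - g0i + treated_term - control_term)
    return scores


def aipw_point_estimate(y, w, g1, g0, h):
    """Average the per-unit scores to obtain the point estimate."""
    return mean(aipw_scores(y, w, g1, g0, h))


if __name__ == "__main__":
    # Toy inputs chosen by hand for illustration; not data from the study.
    y = [3.0, 1.0]
    w = [1, 0]
    g1 = [2.0, 2.0]
    g0 = [1.0, 1.0]
    h = [0.5, 0.5]
    print(aipw_point_estimate(y, w, g1, g0, h))
```

The sketch stops at the point estimate on purpose. With network data the scores are dependent across units, so the usual i.i.d. standard error of their mean would not be appropriate, and the variance has to account for the dependency graph.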
