Causal clustering: design of cluster experiments under network interference
Davide Viviano, Lihua Lei, Guido Imbens, Brian Karrer, Okke Schrijvers, Liang Shi
TL;DR
The paper studies how to estimate the global average treatment effect under network interference, where spillovers between connected units complicate the choice of clusters. It introduces Causal Clustering, an algorithm that minimizes the worst-case mean-squared error (MSE) of the estimator by solving a penalized min-cut problem via semi-definite programming (SDP) relaxations, tracing out a Pareto frontier between bias and variance. The authors derive closed-form expressions for the worst-case bias and variance, establish a practical rule for choosing between Bernoulli (individual-level) and cluster randomization, and illustrate the method with Facebook network data and data from an existing field experiment, showing how the choice of clustering and the number of clusters affect inference. The approach offers a principled, scalable guide to cluster design in network settings, applicable to online experiments and field trials.
Abstract
This paper studies the design of cluster experiments to estimate the global treatment effect in the presence of network spillovers. We provide a framework to choose the clustering that minimizes the worst-case mean-squared error of the estimated global effect. We show that optimal clustering solves a novel penalized min-cut optimization problem computed via off-the-shelf semi-definite programming algorithms. Our analysis also characterizes simple conditions to choose between any two cluster designs, including choosing between a cluster or individual-level randomization. We illustrate the method's properties using unique network data from the universe of Facebook's users and existing data from a field experiment.
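The bias-variance trade-off behind the choice between cluster and individual-level randomization can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the paper's closed-form expressions: the function names `clustering_proxies` and `worst_case_score`, the bias proxy (average share of a unit's neighbors assigned to a different cluster), the variance proxy (sum of squared cluster shares), and the weight `xi` on squared bias are all hypothetical. The sketch also skips the SDP relaxation and only scores two given clusterings on a toy graph.

```python
import numpy as np

def clustering_proxies(adj, labels):
    """Return illustrative (bias_proxy, variance_proxy) for a clustering.

    Assumed proxies, not the paper's exact formulas:
    - bias_proxy: mean share of each unit's neighbors outside its cluster
    - variance_proxy: sum over clusters of (cluster size / n) ** 2
    """
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    n = adj.shape[0]
    degree = adj.sum(axis=1)
    same_cluster = labels[:, None] == labels[None, :]
    # Number of each unit's edges that cross a cluster boundary.
    cut_edges = (adj * ~same_cluster).sum(axis=1)
    # Isolated units (degree 0) contribute a share of 0.
    share_outside = np.divide(cut_edges, degree, out=np.zeros(n), where=degree > 0)
    _, sizes = np.unique(labels, return_counts=True)
    return share_outside.mean(), np.sum((sizes / n) ** 2)

def worst_case_score(bias_proxy, variance_proxy, xi):
    """Hypothetical objective: xi weights squared bias against variance."""
    return xi * bias_proxy ** 2 + variance_proxy

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

designs = {
    "cluster (one per triangle)": [0, 0, 0, 1, 1, 1],
    "Bernoulli (one per unit)": list(range(6)),
}
for xi in (1.0, 0.1):
    for name, labels in designs.items():
        b, v = clustering_proxies(adj, labels)
        print(f"xi={xi}: {name}: score={worst_case_score(b, v, xi):.4f}")
```

On this toy graph the triangle clustering has the lower score when `xi` is large (spillover bias dominates), while the individual-level design has the lower score when `xi` is small (variance dominates), which is the kind of comparison the abstract's design rule formalizes.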
