Journey to the Centre of Cluster: Harnessing Interior Nodes for A/B Testing under Network Interference
Qianyi Chen, Anpeng Wu, Bo Li, Lu Deng, Yong Wang
TL;DR
The paper tackles estimating the global average treatment effect under network interference by exploiting graph clustering. It introduces the mean-in-interior (MII) estimator, which averages outcomes over interior nodes to reduce variance, and proves consistency under mild representativeness assumptions. To address bias from covariate-distribution shifts between interior and boundary units, it proposes an augmented MII (AMII) estimator that uses a counterfactual predictor trained on the full network, framing the approach as a semi-supervised, prediction-powered inference strategy. Extensive simulations demonstrate AMII’s strong bias reduction and robustness across settings, with MII offering low variance and AMII providing practical improvements in early-stage experiments. The work offers a principled, scalable path for network-aware A/B testing that balances bias and variance through interior-focused estimation and predictive debiasing.
Abstract
A/B testing on platforms often faces challenges from network interference, where a unit's outcome depends not only on its own treatment but also on the treatments of its network neighbors. To address this, cluster-level randomization has become standard, enabling the use of network-aware estimators. These estimators typically trim the data to retain only a subset of informative units, achieving low bias under suitable conditions but often suffering from high variance. In this paper, we first demonstrate that the interior nodes (units whose neighbors all lie within the same cluster) constitute the vast majority of the post-trimming subpopulation. In light of this, we propose directly averaging over the interior nodes to construct the mean-in-interior (MII) estimator, which circumvents the delicate reweighting required by existing network-aware estimators and substantially reduces variance in classical settings. However, we show that interior nodes are often not representative of the full population, particularly in terms of network-dependent covariates, leading to notable bias. We then augment the MII estimator with a counterfactual predictor trained on the entire network, allowing us to adjust for covariate distribution shifts between the interior nodes and the full population. By rearranging the expression, we reveal that our augmented MII estimator embodies an analytical form of the point estimator within the prediction-powered inference framework. This insight motivates a semi-supervised lens, wherein interior nodes are treated as labeled data subject to selection bias. Extensive and challenging simulation studies demonstrate the outstanding performance of our augmented MII estimator across various settings.
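
As a rough illustration of the ideas in the abstract, the Python sketch below marks interior nodes on a toy clustered graph, computes a mean-in-interior difference, and then applies a prediction-powered style correction. Everything here is an assumption made for illustration: the function names (`interior_mask`, `mii`, `amii`), the six-node graph, the outcomes, and the predictor outputs `f1` and `f0`. The augmented form follows the generic prediction-powered point estimator (a plug-in mean over all units plus a residual correction on the interior units) and is not necessarily the exact expression used in the paper.

```python
import numpy as np

def interior_mask(edges, cluster):
    """Return True for nodes whose neighbors all lie in the node's own cluster."""
    cluster = np.asarray(cluster)
    interior = np.ones(cluster.shape[0], dtype=bool)
    for u, v in edges:
        # An edge that crosses clusters makes both endpoints boundary nodes.
        if cluster[u] != cluster[v]:
            interior[u] = False
            interior[v] = False
    return interior

def mii(y, z, interior):
    """Difference of mean outcomes over interior nodes of treated vs. control clusters."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    return y[interior & z].mean() - y[interior & ~z].mean()

def amii(y, z, interior, f1, f0):
    """Hypothetical augmented form: plug-in over all nodes plus interior residuals.

    f1[i] and f0[i] are assumed predictions of node i's outcome under
    global treatment and global control, from some predictor fit elsewhere.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=bool)
    f1 = np.asarray(f1, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    plug_in = (f1 - f0).mean()                   # uses every node's covariates
    resid_t = (y - f1)[interior & z].mean()      # correction from treated interior
    resid_c = (y - f0)[interior & ~z].mean()     # correction from control interior
    return plug_in + resid_t - resid_c

# Toy path graph 0-1-2-3-4-5 with two clusters; cluster 0 is treated.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
cluster = [0, 0, 0, 1, 1, 1]
z = [1, 1, 1, 0, 0, 0]            # node-level treatment inherited from the cluster
y = [3.0, 5.0, 9.0, 2.0, 1.0, 1.0]

interior = interior_mask(edges, cluster)   # nodes 2 and 3 touch the other cluster
mii_hat = mii(y, z, interior)

# Made-up predictor outputs, for illustration only.
f1 = [4.0, 4.0, 6.0, 5.0, 4.0, 4.0]
f0 = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]
amii_hat = amii(y, z, interior, f1, f0)

print(interior, mii_hat, amii_hat)
```

In this toy setup the boundary nodes 2 and 3 are excluded from the MII average, while the augmented version still uses their predicted outcomes through the plug-in term. This is how the covariate shift between interior nodes and the full population enters the correction.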
