Spatially Randomized Designs Can Enhance Policy Evaluation
Ying Yang, Chengchun Shi, Fang Yao, Shouyang Wang, Hongtu Zhu
TL;DR
The paper tackles policy evaluation under spatial interference by introducing spatially randomized designs (global, individual, cluster) and providing parametric, semiparametric, and dynamic estimation frameworks. It derives MSE and power results, showing that spatial randomization can substantially improve estimator efficiency and testing power, with the optimal cluster size scaling as $c^*\asymp r$. The work introduces both traditional and doubly robust methods, including a dynamic DRL approach with mean-field approximations to mitigate the high dimensionality of spatio-temporal data. Empirical validation through simulations and a real ride-hailing dataset demonstrates consistent performance gains over global designs, supporting practical adoption in large-scale online experiments. The findings offer actionable guidance for designing efficient A/B tests in networks with interference, with implications for ride-sharing, e-commerce, and digital advertising platforms.
Abstract
This article studies the benefits of using spatially randomized experimental designs, which partition the experimental area into distinct, non-overlapping units and assign treatments to those units at random. Such designs offer improved policy evaluation in online experiments by providing more precise policy value estimators and more effective A/B testing algorithms than traditional global designs, which apply the same treatment across all units simultaneously. We examine both parametric and nonparametric methods for estimating policy values and conducting inference on them under this randomized approach. Our analysis includes evaluating the mean squared error of the treatment effect estimator and the statistical power of the associated tests. Additionally, we extend our findings to experiments with spatio-temporal dependencies, where treatments are allocated sequentially over time, and account for potential temporal carryover effects. Our theoretical insights are supported by comprehensive numerical experiments.
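To make the contrast between the designs concrete, the following is a minimal sketch of how treatment assignment for a single time period might look under the global, individual, and cluster designs. It is an illustration only, not the authors' implementation: the function name `assign_treatments`, the one-dimensional indexing of regions, the use of contiguous index blocks as clusters, and the fair-coin (probability 1/2) assignment are all assumptions made here for simplicity.

```python
import random


def assign_treatments(n_regions, design, cluster_size=1, seed=None):
    """Return a list of 0/1 treatment indicators, one per region.

    Hypothetical helper: regions are indexed 0..n_regions-1, and each
    random draw is a fair coin flip (an assumption, not from the paper).
    """
    rng = random.Random(seed)

    if design == "global":
        # One coin flip shared by every region: all units get the same arm.
        arm = rng.randint(0, 1)
        return [arm] * n_regions

    if design == "individual":
        # One independent coin flip per region.
        return [rng.randint(0, 1) for _ in range(n_regions)]

    if design == "cluster":
        if cluster_size < 1:
            raise ValueError("cluster_size must be at least 1")
        # Assumption: clusters are contiguous blocks of region indices.
        n_clusters = -(-n_regions // cluster_size)  # ceiling division
        draws = [rng.randint(0, 1) for _ in range(n_clusters)]
        # Every region inherits the draw of the cluster it belongs to.
        return [draws[i // cluster_size] for i in range(n_regions)]

    raise ValueError(f"unknown design: {design!r}")
```

In this sketch, `cluster_size=1` reduces the cluster design to the individual design, and `cluster_size >= n_regions` reduces it to the global design, so the cluster size acts as the single knob interpolating between the two extremes.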
