Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences
Runpeng Dai, Jianing Wang, Fan Zhou, Shikai Luo, Zhiwei Qin, Chengchun Shi, Hongtu Zhu
TL;DR
This work addresses off-policy evaluation under complex spatio-temporal interference by introducing a causal deepset framework that learns permutation-invariant mean-outcome functions without relying on parametric mean-field assumptions. The core innovation, the Permutation Invariant Estimator (PIE), aggregates neighbor information through learned neural blocks that are invariant to neighbor ordering, enabling flexible interference modeling. PIE is integrated with three OPE strategies (value-based, importance sampling, and doubly robust) and extended to dynamic settings via an MDP/MARL formulation, with theoretical guarantees including consistency, convergence rates, and minimax optimality. Empirical results across synthetic nondynamic and dynamic simulations, as well as real-data-based simulations, show substantial improvements over mean-field baselines, highlighting the approach’s practicality for ride-sharing and related spatial-temporal domains. A Python implementation is provided at the authors’ repository.
Abstract
Off-policy evaluation (OPE) is widely applied in sectors such as pharmaceuticals and e-commerce to evaluate the efficacy of novel products or policies from offline datasets. This paper introduces a causal deepset framework that relaxes several key structural assumptions, primarily the mean-field assumption, prevalent in existing OPE methodologies that handle spatio-temporal interference. These traditional assumptions frequently prove inadequate in real-world settings, thereby restricting the capability of current OPE methods to effectively address complex interference effects. In response, we advocate for the implementation of the permutation invariance (PI) assumption. This innovative approach enables the data-driven, adaptive learning of the mean-field function, offering a more flexible estimation method beyond conventional averaging. Furthermore, we present novel algorithms that incorporate the PI assumption into OPE and thoroughly examine their theoretical foundations. Our numerical analyses demonstrate that this novel approach yields significantly more precise estimations than existing baseline algorithms, thereby substantially improving the practical applicability and effectiveness of OPE methodologies. A Python implementation of our proposed method is available at https://github.com/BIG-S2/Causal-Deepsets.
