Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences

Runpeng Dai, Jianing Wang, Fan Zhou, Shikai Luo, Zhiwei Qin, Chengchun Shi, Hongtu Zhu

TL;DR

This work addresses off-policy evaluation under complex spatio-temporal interference by introducing a causal deepset framework that learns permutation-invariant mean-outcome functions without relying on parametric mean-field assumptions. The core innovation, the Permutation Invariant Estimator (PIE), aggregates neighbor information through learned neural blocks that are invariant to neighbor ordering, enabling flexible interference modeling. PIE is integrated with three OPE strategies (value-based, importance sampling, and doubly robust) and extended to dynamic settings via an MDP/MARL formulation, with theoretical guarantees including consistency, convergence rates, and minimax optimality. Empirical results across synthetic nondynamic and dynamic simulations, as well as real-data-based simulations, show substantial improvements over mean-field baselines, highlighting the approach’s practicality for ride-sharing and related spatio-temporal domains. A Python implementation is provided at the authors’ repository.
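To make the three OPE strategies named above concrete, here is a minimal sketch of their standard textbook forms for a single-stage problem with binary treatments. It deliberately ignores interference and is not the paper's estimator: the data-generating process, the oracle outcome model `mu_hat`, and the policies `behavior` and `target` are all hypothetical stand-ins chosen for illustration. In the setting described above, a learned mean-outcome model would take the place of `mu_hat`.

```python
import random

ACTIONS = (0, 1)  # binary treatment, assumed for illustration

def value_based(data, pi, mu_hat):
    """Value-based (direct) estimate: average the outcome model under the target policy."""
    return sum(sum(pi(a, x) * mu_hat(x, a) for a in ACTIONS) for x, _, _ in data) / len(data)

def importance_sampling(data, pi, b):
    """Importance sampling estimate: reweight observed outcomes by pi / b."""
    return sum(pi(a, x) / b(a, x) * r for x, a, r in data) / len(data)

def doubly_robust(data, pi, b, mu_hat):
    """Doubly robust estimate: direct term plus an importance-weighted residual."""
    total = 0.0
    for x, a, r in data:
        direct = sum(pi(ap, x) * mu_hat(x, ap) for ap in ACTIONS)
        correction = pi(a, x) / b(a, x) * (r - mu_hat(x, a))
        total += direct + correction
    return total / len(data)

# Hypothetical toy problem: outcome = x + 2 * a + noise, with x ~ Uniform(0, 1).
def behavior(a, x):
    return 0.5                      # behavior policy treats at random

def target(a, x):
    return 1.0 if a == 1 else 0.0   # target policy always treats

def mu_hat(x, a):
    return x + 2.0 * a              # oracle outcome model (assumption, not learned)

rng = random.Random(0)
data = []
for _ in range(5000):
    x = rng.random()
    a = 1 if rng.random() < 0.5 else 0
    r = x + 2.0 * a + rng.gauss(0.0, 0.1)
    data.append((x, a, r))

vb = value_based(data, target, mu_hat)
ips = importance_sampling(data, target, behavior)
dr = doubly_robust(data, target, behavior, mu_hat)
print(vb, ips, dr)  # all three should be near the true value E[x] + 2 = 2.5
```

With an accurate outcome model, the importance sampling estimate is typically the noisiest of the three, while the doubly robust estimate stays close to the value-based one because its correction term only averages small residuals.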

Abstract

Off-policy evaluation (OPE) is widely applied in sectors such as pharmaceuticals and e-commerce to evaluate the efficacy of novel products or policies from offline datasets. This paper introduces a causal deepset framework that relaxes several key structural assumptions, primarily the mean-field assumption, prevalent in existing OPE methodologies that handle spatio-temporal interference. These traditional assumptions frequently prove inadequate in real-world settings, thereby restricting the capability of current OPE methods to effectively address complex interference effects. In response, we advocate for the implementation of the permutation invariance (PI) assumption. This innovative approach enables the data-driven, adaptive learning of the mean-field function, offering a more flexible estimation method beyond conventional averaging. Furthermore, we present novel algorithms that incorporate the PI assumption into OPE and thoroughly examine their theoretical foundations. Our numerical analyses demonstrate that this novel approach yields significantly more precise estimations than existing baseline algorithms, thereby substantially improving the practical applicability and effectiveness of OPE methodologies. A Python implementation of our proposed method is available at https://github.com/BIG-S2/Causal-Deepsets.

Paper Structure

This paper contains 44 sections, 16 theorems, 203 equations, 6 figures, 1 algorithm.

Key Result

Theorem 1

Suppose the general network interference assumption (ass: neighbor) and the permutation invariance assumption (ass:PI) hold. Then the mean outcome function $f_i$ can be approximated to any desired level of precision by the Permutation Invariant Estimator (PIE), given an appropriate choice of the functions $\phi_i$ and $\psi_i$.
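
As a hedged sketch only, and not the paper's exact statement, the architecture described in the Figure 3 caption (a shared network $\psi$ applied to each neighbor, an aggregation step, then a network $\phi$ applied to the result together with the central region's vector) suggests a deep-sets form such as the one below. The symbols $x_i$ (confounder-treatment vector of region $i$) and $\mathcal{N}_i$ (its set of neighbors), as well as the choice of summation as the aggregator, are assumptions introduced here for illustration.

$$
f_i\bigl(x_i, \{x_j\}_{j \in \mathcal{N}_i}\bigr) \;\approx\; \phi_i\Bigl(x_i,\; \sum_{j \in \mathcal{N}_i} \psi_i(x_j)\Bigr)
$$

Because the sum does not depend on the order of its terms, any function of this form is unchanged when the neighbors are relabeled.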

Figures (6)

  • Figure 1: Illustration of the permutation invariant (PI) mean-outcome function. The red hexagon represents the confounder-treatment pair of the central region, while the other six hexagons represent its neighboring regions. The upper-right subplot shows the mean-outcome function under only the general network interference assumption (ass: neighbor); here, the output of $f$ changes across different permutations of the neighboring regions. By contrast, the lower-right subplot shows that under the permutation invariance assumption (ass:PI), all outputs take the same value.
  • Figure 2: Variable dependencies under different spatial interference structures. The dashed line represents the spatial interference effect. Each line represents a different region and horizontal adjacency depicts the proximity of regions.
  • Figure 3: Graphical visualization of the proposed structure. The hexagonal prism at the bottom-right corner represents the confounder-treatment vectors of a central region (green) and its six neighboring regions (blue). The vectors from the neighboring regions are each passed through the same neural network, denoted by $\psi$, and the aggregated output forms the PIE interference effect function, $m_{\text{PIE}, i}$. The central region's vector is then concatenated with $m_{\text{PIE},i}$, and the final estimator is obtained by passing this combined vector through a feedforward neural network, denoted by $\phi$.
  • Figure 4: Nondynamic simulation results: Mean Squared Errors (MSEs) of various policy value estimators are aggregated over 50 simulation replications. The top panels display results for $l=5$, while the bottom panels are for $l=10$. The left panels correspond to the linear setting, the middle panels to nonlinear setting I, and the right panels to nonlinear setting II.
  • Figure 5: Dynamic simulation results: MSEs of VB and DR estimators corresponding to different combinations of $l$ and $Q$.
  • ...and 1 more figure
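
The forward pass described in the Figure 3 caption can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random rather than trained, the aggregation is assumed to be a sum, and the helper names (`make_layer`, `dense`, `pie_forward`) and layer sizes are hypothetical.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random dense layer (weights, biases); untrained, for illustration only."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases

def dense(layer, x, activate=True):
    """Affine map followed by an optional tanh nonlinearity."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [math.tanh(v) for v in out] if activate else out

def pie_forward(center, neighbors, psi, phi_hidden, phi_out):
    # Apply the same network psi to every neighbor's confounder-treatment vector.
    embedded = [dense(psi, x) for x in neighbors]
    # Aggregate across neighbors; summation is an assumed choice of aggregator.
    m_pie = [sum(column) for column in zip(*embedded)]
    # Concatenate the central region's vector with the aggregated neighbor effect.
    combined = list(center) + m_pie
    # Pass the combined vector through a small feedforward network phi.
    hidden = dense(phi_hidden, combined)
    return dense(phi_out, hidden, activate=False)[0]

rng = random.Random(0)
dim, width = 3, 4  # hypothetical input and hidden sizes
psi = make_layer(dim, width, rng)
phi_hidden = make_layer(dim + width, width, rng)
phi_out = make_layer(width, 1, rng)

center = [rng.gauss(0.0, 1.0) for _ in range(dim)]
neighbors = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(6)]

y = pie_forward(center, neighbors, psi, phi_hidden, phi_out)
y_reversed = pie_forward(center, list(reversed(neighbors)), psi, phi_hidden, phi_out)
print(y, y_reversed)  # equal up to floating-point rounding
```

Reordering the six neighbors leaves the output unchanged up to rounding, because the neighbors enter only through the order-independent sum. Swapping the central vector with a neighbor's vector is not covered by this invariance, since the central region is treated separately.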

Theorems & Definitions (22)

  • Definition 1: Permutation operator
  • Theorem 1: Permutation Invariant Estimator (PIE)
  • Definition 2: Permutation Invariant Functions
  • Theorem 2: Consistency of PIE
  • Theorem 3: Convergence rate of PIE
  • Theorem 4: Minimax Optimality of PIE
  • Corollary 1: Convergence Rate of PIE
  • Corollary 2
  • Corollary 3
  • Lemma 1: Symmetrization, Lemma 5 in Tengyuan
  • ...and 12 more
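
The notion of a permutation invariant function listed above can be checked by brute force on a small example. The sketch below uses the usual meaning of the term (the output is unchanged under every reordering of the neighbor inputs); the paper's own definitions are not reproduced here, and the three example functions and the helper `is_permutation_invariant` are hypothetical illustrations.

```python
from itertools import permutations

def is_permutation_invariant(f, center, neighbors, tol=1e-9):
    """Check f(center, neighbors) against every reordering of the neighbors."""
    base = f(center, neighbors)
    return all(abs(f(center, list(p)) - base) <= tol for p in permutations(neighbors))

def mean_field(center, neighbors):
    # Depends on the neighbors only through their average.
    return center + sum(neighbors) / len(neighbors)

def max_pool(center, neighbors):
    # Permutation invariant, but not a function of the average.
    return center * max(neighbors)

def position_weighted(center, neighbors):
    # The weight depends on the slot index, so the ordering matters.
    return center + sum((k + 1) * x for k, x in enumerate(neighbors))

center = 1.0
neighbors = [0.5, -1.0, 2.0, 0.0]

results = {
    "mean_field": is_permutation_invariant(mean_field, center, neighbors),
    "max_pool": is_permutation_invariant(max_pool, center, neighbors),
    "position_weighted": is_permutation_invariant(position_weighted, center, neighbors),
}
print(results)
```

The `max_pool` example illustrates why permutation invariance is a weaker requirement than a mean-field structure: it passes the check even though it cannot be written as a function of the neighbor average alone.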