Can We Validate Counterfactual Estimations in the Presence of General Network Interference?

Sadegh Shirani, Yuwei Luo, William Overman, Ruoxuan Xiong, Mohsen Bayati

TL;DR

This work addresses the validation of counterfactual estimates under general network interference by combining a distribution-preserving network bootstrap (DPNB) with a counterfactual cross-validation (C-CV) framework. It generalizes causal message-passing to heterogeneous units and uses a finite-sample, non-asymptotic state evolution (SE) analysis to learn invariant propagation mappings $f_t$ that generate counterfactual evolutions $\text{CFE}(\bm{w}')$. A benchmark toolbox with six semi-synthetic environments provides ground truth for validation and broadens the evaluation of causal methods in networked settings. The results show robust, data-driven estimation and validation across diverse interference structures and temporal patterns, supporting more reliable decision-making in online platforms, public health, and complex systems. The approach lays a principled foundation for detecting and correcting bias (via terms such as $\bar{\rho}$), while DPNB improves sample efficiency and C-CV provides data-driven model selection.

Abstract

Randomized experiments have become a cornerstone of evidence-based decision-making in contexts ranging from online platforms to public health. However, in experimental settings with network interference, a unit's treatment can influence outcomes of other units, challenging both causal effect estimation and its validation. Classic validation approaches fail as outcomes are only observable under a single treatment scenario and exhibit complex correlation patterns due to interference. To address these challenges, we introduce a framework that facilitates the use of machine learning tools for both estimation and validation in causal inference. Central to our approach is the new distribution-preserving network bootstrap, a theoretically-grounded technique that generates multiple statistically-valid subpopulations from a single experiment's data. This amplification of experimental samples enables our second contribution: a counterfactual cross-validation procedure. This procedure adapts the principles of model validation to the unique constraints of causal settings, providing a rigorous, data-driven method for selecting and evaluating estimators. We extend recent causal message-passing developments by incorporating heterogeneous unit-level characteristics and varying local interactions, ensuring reliable finite-sample performance through non-asymptotic analysis. Additionally, we develop and publicly release a comprehensive benchmark toolbox featuring diverse experimental environments, from networks of interacting AI agents to ride-sharing applications. These environments provide known ground truth values while maintaining realistic complexities, enabling systematic evaluation of causal inference methods. Extensive testing across these environments demonstrates our method's robustness to diverse forms of network interference.

Paper Structure

This paper contains 43 sections, 18 theorems, 97 equations, 29 figures, 2 tables, 8 algorithms.

Key Result

Theorem 3.1

Suppose that functions $g_{}^{}$ and $h_{}^{}$ are affine in their first argument. There exist functions $g_{}^{} , g_{}^{} , h_{}^{}$ and $h_{}^{}$ such that $g_{}^{} (y, \cdot) = y g_{}^{} (\cdot) + g_{}^{} (\cdot)$ as well as $h_{}^{} (y, \cdot) = y h_{}^{} (\cdot) + h_{}^{} (\cdot)$. Under certai… where the convergence is in probability. When the estimation of the mappings $f_{t}$ achieves stron…

Figures (29)

  • Figure 1: Validating Counterfactual Estimations: (1) Network interventions evolve through a propagation phase, governed by invariant mathematical rules known as state evolution equations. (2) Distribution-preserving network bootstrap generates different samples that retain the statistical properties of the original experimental population. (3) Counterfactual cross-validation partitions the time horizon to train and validate estimation models. (4) Six experimental environments spanning diverse domains and network structures to test estimation methods.
  • Figure 2: Evolution of outcomes sample mean (z-axis) with respect to time $t$ and treatment probability $p$. Red and blue contours highlight the counterfactual evolutions at treatment probabilities $p=0.25$ and $p=0.75$, respectively. The magenta line represents the equilibrium state, where the treatment effect has stabilized.
  • Figure 3: Estimation strategy: Experimental data is used to estimate state evolution mappings through supervised learning, which are then applied recursively to generate desired counterfactual evolutions. Treatment allocations $\bm{w}$ and $\bm{w'}$ share identical initial columns, serving as initialization for our recursive approach. While observations are collected at the unit level, the estimated counterfactuals represent population-level quantities.
  • Figure 4: Difference‑in‑Means (DM) performance under interference. Left: $\sigma$ sweep at fixed $\mu=0.04$. Right: $\mu$ sweep at fixed $\sigma=0.5$. Curves show MSE, variance, and squared bias with $\pm 1$ SE bands from a nested bootstrap. Axes are log–log.
  • Figure 5: Counterfactual Cross-Validation. Time horizon is partitioned into blocks for leave-one-out validation. Models are trained on remaining blocks and evaluated via MSE to select optimal configurations.
  • ...and 24 more figures
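
The captions of Figures 3 and 5 describe fitting a one-step mapping by supervised learning, applying it recursively, and scoring candidate models by leave-one-block-out MSE over the time horizon. The sketch below illustrates that general idea on synthetic data only; it is not the authors' C-CV implementation. The function names (`features`, `fit`, `rollout`, `blocked_cv_score`), the two candidate feature maps, the use of a population-mean outcome series `y` with treated fraction `p`, and the data-generating rule are all assumptions made for illustration.

```python
import numpy as np

def features(y_prev, p_next, kind):
    # Hypothetical candidate feature maps for the one-step mapping.
    cols = [np.ones_like(y_prev), y_prev, p_next]
    if kind == "interaction":
        cols.append(y_prev * p_next)
    return np.column_stack(cols)

def fit(y, p, idx, kind):
    # Least-squares fit of y[t+1] on features of (y[t], p[t+1]) for t in idx.
    X = features(y[idx], p[idx + 1], kind)
    coef, *_ = np.linalg.lstsq(X, y[idx + 1], rcond=None)
    return coef

def rollout(coef, y_start, p_future, kind):
    # Apply the fitted mapping recursively, feeding each prediction back in.
    preds, y_hat = [], float(y_start)
    for p_next in p_future:
        x = features(np.array([y_hat]), np.array([p_next]), kind)
        y_hat = float((x @ coef)[0])
        preds.append(y_hat)
    return np.array(preds)

def blocked_cv_score(y, p, kind, n_blocks=5):
    # Split the transitions t -> t+1 into contiguous time blocks, hold out
    # one block at a time, train on the rest, and average the rollout MSE.
    transitions = np.arange(len(y) - 1)
    errors = []
    for held_out in np.array_split(transitions, n_blocks):
        train = np.setdiff1d(transitions, held_out)
        coef = fit(y, p, train, kind)
        preds = rollout(coef, y[held_out[0]], p[held_out + 1], kind)
        errors.append(np.mean((preds - y[held_out + 1]) ** 2))
    return float(np.mean(errors))

# Synthetic series (an assumption, not data from the paper): the mean outcome
# depends on its previous value, the treated fraction, and their product.
rng = np.random.default_rng(0)
T = 60
p = rng.uniform(0.1, 0.9, size=T + 1)
y = np.empty(T + 1)
y[0] = 0.5
for t in range(T):
    noise = rng.normal(0.0, 0.002)
    y[t + 1] = 0.5 * y[t] + 0.3 * p[t + 1] + 0.4 * y[t] * p[t + 1] + noise

scores = {kind: blocked_cv_score(y, p, kind) for kind in ("linear", "interaction")}
best = min(scores, key=scores.get)
print(scores, "selected:", best)
```

Because the synthetic rule includes a product term, the `interaction` candidate should achieve the lower held-out error and be selected. Note that in this simplified version adjacent blocks share a boundary observation, so the held-out blocks are not fully isolated from the training transitions.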

Theorems & Definitions (36)

  • Remark 3.1
  • Remark 3.2
  • Example 3.1
  • Remark 3.3
  • Theorem 3.1: Consistency - informal statement
  • Proposition 3.1
  • Remark 4.1
  • Theorem 4.1
  • Example 4.1
  • Theorem 4.2: Unit-level decomposition rule
  • ...and 26 more