Can We Validate Counterfactual Estimations in the Presence of General Network Interference?

Sadegh Shirani; Yuwei Luo; William Overman; Ruoxuan Xiong; Mohsen Bayati

TL;DR

This work addresses the validation of counterfactual estimates under general network interference by combining a distribution-preserving network bootstrap (DPNB) with a counterfactual cross-validation (C-CV) framework. It generalizes causal message-passing to heterogeneous units and uses a finite-sample, non-asymptotic state evolution (SE) analysis to learn invariant propagation mappings $f_t$ that generate counterfactual evolutions $\text{CFE}(\bm{w}')$. A benchmark toolbox with six semi-synthetic environments provides ground-truth values for validation and broadens the evaluation of causal methods in networked settings. The results show robust, data-driven estimation and validation across diverse interference structures and temporal patterns, supporting more reliable decision-making in online platforms, public health, and complex systems. The approach lays a principled foundation for detecting and correcting bias (via the $\boldsymbol{}$ and $\bar{\rho}$ terms) while improving sample efficiency through DPNB and enabling data-driven model selection via C-CV.

Abstract

Randomized experiments have become a cornerstone of evidence-based decision-making in contexts ranging from online platforms to public health. However, in experimental settings with network interference, a unit's treatment can influence outcomes of other units, challenging both causal effect estimation and its validation. Classic validation approaches fail as outcomes are only observable under a single treatment scenario and exhibit complex correlation patterns due to interference. To address these challenges, we introduce a framework that facilitates the use of machine learning tools for both estimation and validation in causal inference. Central to our approach is the new distribution-preserving network bootstrap, a theoretically grounded technique that generates multiple statistically valid subpopulations from a single experiment's data. This amplification of experimental samples enables our second contribution: a counterfactual cross-validation procedure. This procedure adapts the principles of model validation to the unique constraints of causal settings, providing a rigorous, data-driven method for selecting and evaluating estimators. We extend recent causal message-passing developments by incorporating heterogeneous unit-level characteristics and varying local interactions, ensuring reliable finite-sample performance through non-asymptotic analysis. Additionally, we develop and publicly release a comprehensive benchmark toolbox featuring diverse experimental environments, from networks of interacting AI agents to ride-sharing applications. These environments provide known ground truth values while maintaining realistic complexities, enabling systematic evaluation of causal inference methods. Extensive testing across these environments demonstrates our method's robustness to diverse forms of network interference.
