Network Causal Effect Estimation In Graphical Models Of Contagion And Latent Confounding
Yufeng Wu, Rohit Bhattacharya
TL;DR
We study how to distinguish contagion from latent confounding in networks under full interference, using segregated graphs (SGs) to represent both mechanisms. We show that the target E[Y_i | do(A=a)] and its variants are identifiable, and we develop coding-likelihood-based likelihood ratio tests that distinguish contagion from latent confounding in each layer (confounders, treatments, and outcomes). We extend auto-g-computation to handle latent confounding, using Gibbs-sampler-based generation of latent variables and a local-structure parameterization of p(L) and p(Y | A, L); the resulting network-effect estimates are unbiased and consistent when the dependence mechanisms are known or correctly inferred. We provide network-asymptotic guarantees and validate the methods on synthetic, semi-synthetic, and real networks, highlighting practical conditions under which six-degree-separated samples support reliable testing and estimation.
Abstract
A key question in many network studies is whether the observed correlations between units are primarily due to contagion or latent confounding. Here, we study this question using a segregated graph (Shpitser, 2015) representation of these mechanisms, and examine how uncertainty about the true underlying mechanism impacts downstream computation of network causal effects, particularly under full interference -- settings where we only have a single realization of a network and each unit may depend on any other unit in the network. Under certain assumptions about asymptotic growth of the network, we derive likelihood ratio tests that can be used to identify whether different sets of variables -- confounders, treatments, and outcomes -- across units exhibit dependence due to contagion or latent confounding. We then propose network causal effect estimation strategies that provide unbiased and consistent estimates if the dependence mechanisms are either known or correctly inferred using our proposed tests. Together, the proposed methods allow network effect estimation in a wider range of full interference scenarios that have not been considered in prior work. We evaluate the effectiveness of our methods with synthetic data and the validity of our assumptions using real-world networks.
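As a rough illustration of the generic mechanics behind a likelihood ratio test, the sketch below compares two nested Bernoulli models on a toy subset of units. It is not the test derived in the paper: the function names (`binom_loglik`, `likelihood_ratio_test`, `dependence_test`), the split of units into "exposed" and "unexposed" neighborhoods, and the counts are all hypothetical. It also assumes the units entering the counts are approximately independent, as a coding-style subset is meant to provide, and that the alternative adds exactly one free parameter.

```python
import math


def binom_loglik(successes, trials, p):
    """Bernoulli log-likelihood (up to a constant) for `successes` out of `trials`."""
    # Skip zero-count terms so that 0 * log(0) is treated as 0.
    ll = 0.0
    if successes > 0:
        ll += successes * math.log(p)
    if trials - successes > 0:
        ll += (trials - successes) * math.log(1.0 - p)
    return ll


def likelihood_ratio_test(loglik_null, loglik_alt):
    """Return (statistic, p-value) for nested models differing by one parameter."""
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def dependence_test(k1, n1, k0, n0):
    """Toy test: do outcome rates differ between two groups of units?

    Hypothetical setup: k1 of n1 units with an "exposed" neighborhood have
    outcome 1, and k0 of n0 units with an "unexposed" neighborhood have
    outcome 1. Both n1 and n0 are assumed to be positive.
    """
    pooled = (k1 + k0) / (n1 + n0)
    # Null model: one shared outcome probability for both groups.
    ll_null = binom_loglik(k1, n1, pooled) + binom_loglik(k0, n0, pooled)
    # Alternative model: a separate outcome probability per group.
    ll_alt = binom_loglik(k1, n1, k1 / n1) + binom_loglik(k0, n0, k0 / n0)
    return likelihood_ratio_test(ll_null, ll_alt)


if __name__ == "__main__":
    stat, p_value = dependence_test(40, 50, 20, 50)
    print(f"LR statistic = {stat:.2f}, p-value = {p_value:.2g}")
```

A small p-value here only indicates that the two groups differ in outcome rate. On its own it does not say whether that dependence arises from contagion or from latent confounding, which is the distinction the tests in this work are designed to make.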
