Spatio-Temporal Hierarchical Causal Models
Xintong Li, Haoran Zhang, Xiao Zhou
TL;DR
ST-HCMs address causal inference in nested spatio-temporal data with unobserved unit-level confounders by extending hierarchical causal modeling to the spatio-temporal domain. The core theoretical contribution is the Spatio-Temporal Collapse Theorem, which proves that an ST-HCM converges to a Dynamic Collapsed Model as the amount of subunit data grows, enabling causal identification via sequential adjustability or dynamic instruments. The paper also provides an estimation algorithm that uses unit-specific conditional dynamics and G-computation and can be paired with linear mixed-effects models, gradient boosting machines (GBMs), or Gaussian processes. Empirical studies on synthetic data and a Chicago urban-traffic dataset demonstrate accurate, robust causal estimates and practical gains from modeling both hierarchy and spatial spillovers.
Abstract
The abundance of fine-grained spatio-temporal data, such as traffic sensor networks, offers vast opportunities for scientific discovery. However, inferring causal relationships from such observational data remains challenging, particularly due to unobserved confounders that are specific to units (e.g., geographical locations) yet influence outcomes over time. Most existing methods for spatio-temporal causal inference assume that all confounders are observed, an assumption that is often violated in practice. In this paper, we introduce Spatio-Temporal Hierarchical Causal Models (ST-HCMs), a novel graphical framework that extends hierarchical causal modeling to the spatio-temporal domain. At the core of our approach is the Spatio-Temporal Collapse Theorem, which shows that a complex ST-HCM converges to a simpler flat causal model as the amount of subunit data increases. This theoretical result enables a general procedure for causal identification, allowing ST-HCMs to recover causal effects even in the presence of unobserved, time-invariant unit-level confounders, a scenario where standard non-hierarchical models fail. We validate the effectiveness of our framework on both synthetic and real-world datasets, demonstrating its potential for robust causal inference in complex dynamic systems.
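The claim that flat models fail under unobserved unit-level confounding, while subunit data can help, can be illustrated with a toy simulation. The sketch below is not the ST-HCM procedure: it assumes a purely linear data-generating process with no spatial or temporal structure, and it uses ordinary within-unit (fixed-effects) demeaning as a stand-in for exploiting the hierarchy. All function names, parameters, and coefficient values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n_units=200, n_sub=50, effect=2.0, seed=0):
    """Toy nested data: each unit has an unobserved confounder U (assumed linear model)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n_units)                  # unobserved, constant within a unit
    unit = np.repeat(np.arange(n_units), n_sub)   # unit index of every subunit record
    a = u[unit] + rng.normal(size=unit.size)      # treatment depends on U
    y = effect * a + 3.0 * u[unit] + rng.normal(size=unit.size)  # outcome depends on A and U
    return unit, a, y


def pooled_slope(a, y):
    """Flat (non-hierarchical) regression of y on a, ignoring unit membership."""
    a_c = a - a.mean()
    return float(a_c @ (y - y.mean()) / (a_c @ a_c))


def within_unit_slope(unit, a, y):
    """Regression after subtracting unit means, which cancels any unit-constant term."""
    n_units = int(unit.max()) + 1
    counts = np.bincount(unit, minlength=n_units)
    a_mean = np.bincount(unit, weights=a, minlength=n_units) / counts
    y_mean = np.bincount(unit, weights=y, minlength=n_units) / counts
    a_c = a - a_mean[unit]
    y_c = y - y_mean[unit]
    return float(a_c @ y_c / (a_c @ a_c))


if __name__ == "__main__":
    unit, a, y = simulate()
    print("pooled estimate:     ", pooled_slope(a, y))
    print("within-unit estimate:", within_unit_slope(unit, a, y))
```

In this simulation the treatment has variance 2 and covariance 1 with the confounder, so the pooled slope is biased upward by about 3 × 1/2 = 1.5 in expectation, whereas the within-unit slope targets the true effect of 2.0 because the unit-constant term drops out after demeaning. The nonlinear, dynamic, and spatial settings addressed by ST-HCMs need the paper's identification results rather than this simple device.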
