GST-UNet: A Neural Framework for Spatiotemporal Causal Inference with Time-Varying Confounding

Miruna Oprescu, David K. Park, Xihaier Luo, Shinjae Yoo, Nathan Kallus

TL;DR

GST-UNet addresses causal inference in spatiotemporal data with time-varying confounding and interference, enabling estimation of location-specific potential outcomes from a single observed trajectory. It fuses a U-Net–ConvLSTM spatiotemporal encoder with iterative G-computation heads, under a representation-based time-invariance embedding and a curriculum-based training regime to stabilize learning of recursive pseudo-outcomes. The approach provides identification and consistency guarantees and is validated on synthetic data and a real-world Camp Fire health analysis, showing superior accuracy and stable counterfactual estimates compared to baselines. This yields a principled, ready-to-use tool for policy-relevant and scientific studies involving complex spatiotemporal causal effects. The combination of theory, architecture, and empirical results advances reliable spatiotemporal causal inference in domains like public health and environmental policy.

Abstract

Estimating causal effects from spatiotemporal observational data is essential in public health, environmental science, and policy evaluation, where randomized experiments are often infeasible. Existing approaches, however, either rely on strong structural assumptions or fail to handle key challenges such as interference, spatial confounding, temporal carryover, and time-varying confounding, in which covariates are influenced by past treatments and, in turn, affect future ones. We introduce GST-UNet (G-computation Spatio-Temporal UNet), a theoretically grounded neural framework that combines a U-Net-based spatiotemporal encoder with regression-based iterative G-computation to estimate location-specific potential outcomes under complex intervention sequences. GST-UNet explicitly adjusts for time-varying confounders and captures non-linear spatial and temporal dependencies, enabling valid causal inference from a single observed trajectory in data-scarce settings. We validate its effectiveness in synthetic experiments and in a real-world analysis of wildfire smoke exposure and respiratory hospitalizations during the 2018 California Camp Fire. Together, these results position GST-UNet as a principled and ready-to-use framework for spatiotemporal causal inference, advancing reliable estimation in policy-relevant and scientific domains.
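To make the idea of regression-based iterative G-computation concrete, the following is a minimal sketch on a toy two-step problem. It is not the GST-UNet implementation: it assumes i.i.d. units rather than a single spatiotemporal trajectory, uses plain linear regressions in place of the neural encoder and heads, and every name in it (`simulate`, `fit_linear`, `g_computation`) is hypothetical. The toy data-generating process is constructed so that the covariate `x2` is affected by the earlier treatment `a1` and also drives the later treatment `a2`, which is the time-varying confounding pattern described above. By construction, the true effect of always treating versus never treating is 3.

```python
import numpy as np

def fit_linear(features, target):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_linear(coef, features):
    """Evaluate a model fitted by fit_linear on new features."""
    design = np.column_stack([np.ones(features.shape[0]), features])
    return design @ coef

def simulate(n, rng):
    """Toy DGP (an assumption for illustration): x2 depends on a1 and drives a2."""
    x1 = rng.normal(size=n)
    a1 = rng.binomial(1, 1.0 / (1.0 + np.exp(-x1)))
    x2 = 0.5 * x1 + 1.0 * a1 + 0.5 * rng.normal(size=n)
    a2 = rng.binomial(1, 1.0 / (1.0 + np.exp(-x2)))
    y = 1.0 * a1 + 1.0 * a2 + 1.0 * x2 + 0.5 * x1 + rng.normal(size=n)
    return x1, a1, x2, a2, y

def g_computation(x1, a1, x2, a2, y, plan):
    """Two-step iterative G-computation for the treatment plan (a1*, a2*)."""
    a1_star, a2_star = plan
    n = len(y)
    # Last step: regress the outcome on the full observed history.
    q2 = fit_linear(np.column_stack([x1, a1, x2, a2]), y)
    # Pseudo-outcome: the fitted model evaluated with a2 set to its planned value.
    pseudo = predict_linear(q2, np.column_stack([x1, a1, x2, np.full(n, a2_star)]))
    # Earlier step: regress the pseudo-outcome on the shorter history.
    q1 = fit_linear(np.column_stack([x1, a1]), pseudo)
    # Set a1 to its planned value and average over the baseline covariate.
    return predict_linear(q1, np.column_stack([x1, np.full(n, a1_star)])).mean()

rng = np.random.default_rng(0)
x1, a1, x2, a2, y = simulate(20_000, rng)

gcomp_effect = (g_computation(x1, a1, x2, a2, y, (1, 1))
                - g_computation(x1, a1, x2, a2, y, (0, 0)))

# Naive alternative: one regression that conditions on x2 and sums the
# treatment coefficients. This blocks the part of a1's effect that flows
# through x2, so it targets 2 rather than the true value of 3.
naive_coef = fit_linear(np.column_stack([x1, a1, x2, a2]), y)
naive_effect = naive_coef[2] + naive_coef[4]

print(f"iterative G-computation: {gcomp_effect:.2f}")
print(f"single adjusted regression: {naive_effect:.2f}")
```

The backward recursion is the point of the sketch: each step regresses the previous step's pseudo-outcome on a shorter history, with the later treatments already fixed to the planned values. In the paper's framework the regressions are neural heads on a shared spatiotemporal representation rather than the linear fits used here.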

Paper Structure

This paper contains 24 sections, 3 theorems, 30 equations, 6 figures, 6 tables, 1 algorithm.

Key Result

Theorem 1

Assume that the paper's standard assumptions and its embedding assumption hold. Further, let $\mathbf{H}^\mathbf{a}_{1:t+k}:=(\mathbf{X}_{1:t+k}, [\mathbf{A}_{1:t-1}, \mathbf{a}_{t:t+k-1}], \mathbf{Y}_{1:t+k})$ denote the history in which the observed treatments from time $t$ onward are replaced by $\mathbf{a}_{t:t+k-1}$. Define r ... Then $\mathbb{E}[\mathbf{Y}_{t+\tau}[\mathbf{a}_{t:t+\tau-1}]\mid \mathbf{H}_{1:t}=\mathbf{h}_{1:t} \ldots$
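The treatment component of the modified history, $[\mathbf{A}_{1:t-1}, \mathbf{a}_{t:t+k-1}]$, can be sketched as a simple concatenation along the time axis. This is an illustrative sketch only: the array layout `(time, height, width)` and the function name `counterfactual_treatments` are assumptions, not taken from the paper.

```python
import numpy as np

def counterfactual_treatments(observed, plan, t):
    """Build [A_{1:t-1}, a_{t:t+k-1}] from gridded treatment arrays.

    observed: array of shape (T, H, W) holding A_1, ..., A_T (assumed layout).
    plan:     array of shape (k, H, W) holding a_t, ..., a_{t+k-1}.
    t:        1-indexed time at which the intervention starts.
    """
    past = observed[: t - 1]          # A_1, ..., A_{t-1} stay as observed
    return np.concatenate([past, plan], axis=0)

# Example: five observed steps on a 2x2 grid, intervene from t = 3 for k = 2 steps.
observed = np.ones((5, 2, 2))
plan = np.zeros((2, 2, 2))
modified = counterfactual_treatments(observed, plan, t=3)
print(modified.shape)
```

The result has $t-1+k$ time steps: the first $t-1$ are the observed treatments and the last $k$ are the planned ones.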

Figures (6)

  • Figure 1: Observational data (left) versus interventional data (right) for a horizon $\tau=2$ across multiple locations $(s,s')$. Under the intervention (right), treatments are set independently of confounders, and the full history is not observed for the entire horizon.
  • Figure 2: Overview of the GST-UNet architecture. The spatiotemporal learning module (left) is a U-Net augmented with a ConvLSTM layer and attention gates. Its final feature map is passed to a set of G-heads (right), where each G-head $Q_k$ implements iterative G-computation (see Algorithm 1).
  • Figure 3: (Left) Daily PM2.5 levels across California from May to December 2018, with red lines marking major wildfires. (Center) Counties exposed to average PM2.5 $> 10$ µg/m$^3$ during the Camp Fire (red), with the origin county in dark red. (Right) Factual minus CAPO-predicted daily respiratory admissions during the peak of the Camp Fire. Hashed areas indicate small-population counties ($<30{,}000$).
  • Figure 4: Samples from the DGP at $t=100$, comparing feature $X_{100}$ (left), intervention $A_{100}$ (center), and outcome $Y_{101}$ (right) for varying $\beta_1\in\{0.0,1.0,2.0\}$.
  • Figure 5: (Left) Daily respiratory illness incidence (cases per 10,000). (Center) Weekly aggregated incidence. (Right) Average daily PM2.5 during the Camp Fire.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Theorem 1: Identification with G-Computation
  • Theorem 2: Consistency of Iterative G-Computation in Spatiotemporal Settings
  • Definition 1: Stochastic equicontinuity (van der Vaart and Wellner, 1996)
  • Theorem 3: Consistency under Uniform Stochastic Equicontinuity
  • Example 1: Feed-forward or convolutional heads