Seeing the Unseen: Learning Basis Confounder Representations for Robust Traffic Prediction

Jiahao Ji, Wentao Zhang, Jingyuan Wang, Chao Huang

TL;DR

Traffic forecasting is challenged by continuous external confounders that modulate X → Y; the paper introduces STEVE, which learns a base confounder bank of K vectors and represents any confounder as a weighted combination, guided by cross-attention. It then uses confounder-oriented self-supervised learning, adversarial disentanglement, and mutual information minimization to separate confounder effects from confounder-irrelevant relations and fuses both signals for robust forecasting. Empirical results across four large datasets show STEVE consistently outperforms baselines under distribution shifts and demonstrates strong generalization to unseen confounders, along with favorable training efficiency and scalability. The work advances practical counterfactual robustness in deep spatiotemporal forecasting by extending back-door ideas to continuous, latent confounders with a learnable basis representation and targeted self-supervision.

Abstract

Traffic prediction is essential for intelligent transportation systems and urban computing. It aims to establish a relationship between historical traffic data X and future traffic states Y using statistical or deep learning methods. However, the X → Y relationship is often influenced by external confounders that simultaneously affect both X and Y, such as weather, accidents, and holidays. Existing deep-learning traffic prediction models adopt the classic front-door and back-door adjustments to address the confounder issue. These methods struggle with continuous or undefined confounders, because they depend on predefined discrete values that are often impractical in complex, real-world scenarios. To overcome this challenge, we propose the Spatial-Temporal sElf-superVised confoundEr learning (STEVE) model. The model introduces a basis vector approach, creating a base confounder bank that represents any confounder as a linear combination of a group of basis vectors. It also incorporates self-supervised auxiliary tasks to enhance the expressive power of the base confounder bank. A confounder-irrelevant relation decoupling module then separates the confounder effects from the direct X → Y relations. Extensive experiments across four large-scale datasets validate our model's superior performance in handling spatial and temporal distribution shifts and underscore its adaptability to unseen confounders. Our model implementation is available at https://github.com/bigscity/STEVE_CODE.
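
The basis-vector idea in the abstract can be illustrated with a minimal sketch. It is not taken from the STEVE code base: the names `combine_from_bank`, `bank`, and `query` are hypothetical, the bank is a fixed random matrix rather than a learned one, and the scaled dot-product attention with softmax weights is an assumed form of the cross-attention mentioned in the TL;DR. Softmax weights give a convex combination, which is one special case of a linear combination.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_from_bank(query, bank):
    """Express a confounder as a weighted combination of K basis vectors.

    query: length-d vector (hypothetical representation derived from traffic data)
    bank:  list of K basis vectors, each of length d (learned in a real model)
    """
    d = len(query)
    # Attention scores: scaled dot product between the query and each basis vector.
    scores = [sum(q * b for q, b in zip(query, basis)) / math.sqrt(d)
              for basis in bank]
    weights = softmax(scores)
    # Weighted sum of the basis vectors, coordinate by coordinate.
    confounder = [sum(w * basis[j] for w, basis in zip(weights, bank))
                  for j in range(d)]
    return weights, confounder


random.seed(0)
K, d = 4, 3  # assumed sizes, chosen only for illustration
bank = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(K)]
query = [0.5, -1.0, 0.2]
weights, confounder = combine_from_bank(query, bank)
```

Because the weights are continuous, any point in the span covered by the bank can be represented, which is how a fixed set of K vectors can stand in for confounders that were never enumerated as discrete values.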

Paper Structure (37 sections, 29 equations, 10 figures, 5 tables)

This paper contains 37 sections, 29 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Structural causal model for traffic forecasting.
  • Figure 2: The pipeline of our STEVE model. Repr: Representation. TCL: Temporal Convolutional Layer. GCL: Graph Convolutional Layer. Info: Information. COSSL: Confounder-Oriented Self-Supervised Learning. We omit the sample index of all variables for simplicity. Fig. 3 illustrates the details of the confounder extractor.
  • Figure 3: The architecture of our confounder extractor. Avg: Average. Att: Attention. For simplicity, the sample index $t$ for $\bm{\mathcal{Z}}$ and $\bm{C}$ is omitted.
  • Figure 4: Adversarial learning is achieved by inserting a GRL between the generator $g_\theta$ and the discriminator $g_\psi$. The forward pass is indicated by arrows while the backward pass is indicated by dashed arrows.
  • Figure 5: Confounder distribution of distinct locations at different time periods. JS Div: Jensen–Shannon divergence.
  • ...and 5 more figures
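
The GRL named in the Figure 4 caption is read here as a gradient reversal layer, which is an assumption since the caption does not expand the acronym. A minimal sketch of that behaviour, with a hypothetical class name and a hypothetical scaling factor `lam`, and with the backward pass written out by hand instead of relying on an autograd framework:

```python
class GradientReversal:
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical reversal strength

    def forward(self, x):
        # Features pass through unchanged on their way to the discriminator.
        return list(x)

    def backward(self, grad_output):
        # Gradients from the discriminator are flipped before reaching the generator,
        # so the generator is pushed to make the discriminator's task harder.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [1.0, -2.0, 3.0]
out = grl.forward(features)
grad_to_generator = grl.backward([0.2, 0.4, -0.6])
```

Placing such a layer between the two networks lets a single backward pass train the discriminator normally while training the generator adversarially.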

Theorems & Definitions (2)

  • Definition 1: Traffic Graph
  • Definition 2: Traffic State