Generalising Traffic Forecasting to Regions without Traffic Observations

Xinyu Su, Majid Sarvi, Feng Liu, Egemen Tanin, Jianzhong Qi

TL;DR

This work targets traffic forecasting in regions lacking sensor coverage by introducing GenCast, a framework that fuses physics-informed priors with external weather signals and a spatial grouping mechanism to improve generalisation to unobserved regions. It employs a differentiable spatial-temporal encoder and two spatial embedding schemes (LLM-based SE-L and GeoHash-based SE-H), plus a cross-domain weather encoder and a physics-based loss derived from the Lighthill-Whitham-Richards model to regularise learning. A contrastive backbone with masked subgraphs promotes robustness to unseen regions, while the spatial grouping module filters localised noise to strengthen transferable patterns. Experimental results across multiple real-world datasets show GenCast variants consistently outperform state-of-the-art baselines, achieving up to 3.1% RMSE reduction and up to 125.6% $R^2$ improvement, demonstrating strong generalisability for unobserved regions and practical impact for wide-area traffic forecasting.

Abstract

Traffic forecasting is essential for intelligent transportation systems. Accurate forecasting relies on continuous observations collected by traffic sensors. However, due to high deployment and maintenance costs, not all regions are equipped with such sensors. This paper aims to forecast for regions without traffic sensors, where the lack of historical traffic observations challenges the generalisability of existing models. We propose a model named GenCast, the core idea of which is to exploit external knowledge to compensate for the missing observations and to enhance generalisation. We integrate physics-informed neural networks into GenCast, enabling physical principles to regularise the learning process. We introduce an external signal learning module to explore correlations between traffic states and external signals such as weather conditions, further improving model generalisability. Additionally, we design a spatial grouping module to filter localised features that hinder model generalisability. Extensive experiments show that GenCast consistently reduces forecasting errors on multiple real-world datasets.
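
To make the physics-informed regularisation concrete: the Lighthill-Whitham-Richards (LWR) model named in the TL;DR is a conservation law, $\partial_t \rho + \partial_x q = 0$, where $\rho$ is traffic density and $q$ is traffic flow. The sketch below is a minimal illustration and not GenCast's implementation. It assumes density and flow are sampled on a regular time-space grid along a single road, approximates both derivatives with forward differences, and uses the mean squared residual as the penalty. The names `lwr_residual` and `physics_loss`, the grid layout, and the squared-error form are assumptions made for illustration only.

```python
import numpy as np


def lwr_residual(density, flow, dt, dx):
    """Finite-difference residual of the LWR conservation law.

    Hypothetical helper, not taken from the paper.
    density, flow: arrays of shape (T, N), i.e. T time steps and N road cells.
    dt, dx: grid spacing in time and space.
    Returns an array of shape (T - 1, N - 1).
    """
    # Forward difference in time: d(rho)/dt
    d_rho_dt = (density[1:, :-1] - density[:-1, :-1]) / dt
    # Forward difference in space: d(q)/dx
    d_q_dx = (flow[:-1, 1:] - flow[:-1, :-1]) / dx
    # The residual is zero wherever the conservation law holds on the grid.
    return d_rho_dt + d_q_dx


def physics_loss(density, flow, dt, dx):
    """Mean squared LWR residual (assumed form of a physics penalty)."""
    residual = lwr_residual(density, flow, dt, dx)
    return float(np.mean(residual ** 2))
```

A density profile that simply travels along the road at constant speed satisfies the conservation law, so its residual is close to zero, whereas a density that changes over time with no flow to account for the change produces a positive penalty. In a trained model, such a term would be added to the data-driven losses so that forecasts violating conservation are discouraged even where no observations are available.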

Paper Structure

This paper contains 60 sections, 22 equations, 15 figures, 6 tables.

Figures (15)

  • Figure 1: Illustration of the problem setting and modelling strategies. Upper: (a) Scattered unobserved locations vs. (b) Unobserved region (our focus). Blue bubbles represent observed locations; red hollow bubbles denote unobserved ones; and red solid bubbles denote the target forecasts. Lower: Comparison of different modelling strategies: (c) incorporating static auxiliary features (e.g., learning from locations with similar POIs or geo-coordinates); (d) integrating physics priors and dynamic external signals (e.g., weather) to improve generalisation (ours).
  • Figure 2: Overview of GenCast. Given observed traffic data $G_o$, a masked view $G_o^m$ is generated by randomly masking a subgraph. Temporal and spatial embeddings $\mathbf{TE}_{enc}$ and $\mathbf{L}_{enc}$ are fused with inputs $\mathbf{X}_{G_o^m}^{t-T+1:t}$ and $\mathbf{X}_{G_o}^{t-T+1:t}$ via a spatial-temporal embedding (STE) layer to produce initial features $\mathbf{H}^0_m$ and $\mathbf{H}^0$, respectively. External signals (weather) are matched to graph nodes via geo-coordinates and are integrated with the node features using an external signal encoder, yielding $\mathbf{H}^0_{m, fuse}$ and $\mathbf{H}^0_{fuse}$. These feature matrices are passed to a spatial-temporal (ST) model with a spatial grouping module, producing forecasts, graph representations, and learnable soft grouping scores for each node. Three loss terms are used: forecast loss $L_{pred}$ measures forecast errors; contrastive loss $L_{cl}$ measures node representation consistency across $G^m_o$ and $G_o$; and group-aware loss $L_{spg}$ measures group assignment confidence. The physics module computes a residual $R$ based on LWR to form a fourth (i.e., physics) loss $L_{phy}$, to regularise GenCast by physical principles of traffic dynamics.
  • Figure 3: Ablation study results. We include results on the other datasets in the appendix. Same below.
  • Figure 4: SPG entropy: GenCast-L vs. w/o-spg (PEMS07).
  • Figure 5: Model performance vs. unobserved ratio.
  • ...and 10 more figures
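
The Figure 2 caption states that the masked view $G_o^m$ is obtained by randomly masking a subgraph of the observed graph $G_o$. One plausible way to do this is sketched below, under stated assumptions: a random seed node is chosen, a breadth-first search collects a connected set of nodes around it, and the features of those nodes are zeroed to imitate a contiguous unobserved region. The function name `mask_random_subgraph`, the breadth-first strategy, and zero-filling are illustrative choices; the caption does not specify how the subgraph is sampled or how masked inputs are represented.

```python
from collections import deque

import numpy as np


def mask_random_subgraph(adj, features, num_masked, rng):
    """Zero the features of a randomly chosen connected subgraph.

    Hypothetical helper, not taken from the paper.
    adj: (N, N) adjacency matrix, non-zero entries mark edges.
    features: (N, T) array of per-node observations.
    num_masked: target number of nodes to mask.
    rng: a numpy.random.Generator.
    Returns (masked copy of features, array of masked node indices).
    """
    num_nodes = adj.shape[0]
    start = int(rng.integers(num_nodes))  # random seed node
    visited = [start]
    seen = {start}
    queue = deque([start])
    # Breadth-first expansion keeps the masked set connected.
    while queue and len(visited) < num_masked:
        node = queue.popleft()
        for neighbour in np.flatnonzero(adj[node]):
            neighbour = int(neighbour)
            if neighbour not in seen:
                seen.add(neighbour)
                visited.append(neighbour)
                queue.append(neighbour)
                if len(visited) == num_masked:
                    break
    # If the seed's connected component is smaller than num_masked,
    # fewer nodes are masked.
    masked = features.copy()  # leave the original view untouched
    masked[visited] = 0.0
    return masked, np.array(visited)
```

Masking a connected block of nodes, rather than scattered individual nodes, mirrors the unobserved-region setting in panel (b) of Figure 1. The original and masked views can then both be encoded so that a contrastive term compares node representations across the two.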