Spatial-temporal Forecasting for Regions without Observations

Xinyu Su, Jianzhong Qi, Egemen Tanin, Yanchuan Chang, Majid Sarvi

TL;DR

This work tackles forecasting for regions without historical observations by proposing STSM, a selective-masking framework that learns from similar, data-rich neighboring regions. The method combines sub-graph masking guided by regional, road-network, and distance cues with graph contrastive learning and a temporal-similarity-aware spatial-temporal model, supported by pseudo-observations for unobserved sites. Empirical results on traffic and air-quality datasets show that STSM consistently outperforms adapted state-of-the-art approaches, with the gains attributed to selective masking, the temporal-similarity adjacency matrix, and contrastive learning. The approach enables accurate predictions under continuous data scarcity and has practical implications for unbalanced development and open-data limitations.

Abstract

Spatial-temporal forecasting plays an important role in many real-world applications, such as traffic forecasting, air pollutant forecasting, and crowd-flow forecasting. State-of-the-art spatial-temporal forecasting models take data-driven approaches and rely heavily on data availability. Such models suffer from accuracy issues when data is incomplete, which is common in reality due to the heavy costs of deploying and maintaining sensors for data collection. A few recent studies attempted to address the issue of incomplete data. They typically assume some data availability in a region of interest either for a short period or at a few locations. In this paper, we further study spatial-temporal forecasting for a region of interest without any historical observations, to address scenarios such as unbalanced region development, progressive deployment of sensors, or lack of open data. We propose a model named STSM for the task. The model takes a contrastive learning-based approach to learn spatial-temporal patterns from adjacent regions that have recorded data. Our key insight is to learn from the locations that resemble those in the region of interest, and we propose a selective masking strategy to enable the learning. As a result, our model outperforms adapted state-of-the-art models, reducing errors consistently over both traffic and air pollutant forecasting tasks. The source code is available at https://github.com/suzy0223/STSM.

Paper Structure (30 sections, 18 equations, 11 figures, 11 tables)

Figures (11)

  • Figure 1: Problem setting comparison. Coloured maps and grey maps indicate data observed and unobserved, respectively. Our focus is Case (c).
  • Figure 2: Model architecture of STSM-RNC. The model contains a sub-graph masking module and a spatial-temporal modelling module.
  • Figure 3: The structure of the spatial-temporal model
  • Figure 4: Model architecture of STSM. The model contains three main parts. (1) The selective masking module leverages the regional and road network representations and the spatial distances to compute the similarity between observed locations (i.e., their sub-graphs) and the unobserved region. Masking probabilities are assigned based on the similarity scores. (2) The contrastive learning module guides STSM to make similar predictions for location graphs with complete data and graphs with incomplete data. (3) The spatial-temporal modelling module (as described in Section \ref{subsec:st_model}) utilises GCNs and 1-D TCNs to model spatial and temporal features, together with a contrastive learning loss to optimise the model. To enhance model performance, STSM generates pseudo-observations for unobserved locations and computes a temporal similarity-based adjacency matrix. During the testing process, STSM fills unobserved locations with pseudo-observations and then feeds the graph into ST-Model to obtain the prediction results.
  • Figure 5: Visualisations of sensor distribution
  • ...and 6 more figures
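
The Figure 4 caption states that masking probabilities are assigned from similarity scores between observed locations and the unobserved region. The sketch below illustrates that idea only and is not the authors' implementation (see the linked repository for that). It assumes one embedding vector per observed location and one for the unobserved region, uses cosine similarity as the score, and turns scores into probabilities with a softmax. The function names, the `temperature` parameter, and the single-embedding simplification are all hypothetical; the paper combines regional, road-network, and distance cues and masks sub-graphs rather than single locations.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def masking_probabilities(location_embeddings, region_embedding, temperature=1.0):
    """Softmax over similarity scores: more similar locations get higher mask probability.

    Assumption: cosine similarity plus softmax stands in for the paper's scoring.
    """
    scores = [
        cosine_similarity(e, region_embedding) / temperature
        for e in location_embeddings
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_masked_locations(probabilities, num_masked, seed=None):
    """Draw distinct location indices, weighted by their masking probabilities."""
    rng = random.Random(seed)
    remaining = list(range(len(probabilities)))
    weights = list(probabilities)
    chosen = []
    for _ in range(min(num_masked, len(remaining))):
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pick))
        weights.pop(pick)
    return sorted(chosen)


# Toy example: three observed locations, one unobserved-region embedding.
observed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
unobserved_region = [1.0, 0.0]
probs = masking_probabilities(observed, unobserved_region)
masked = sample_masked_locations(probs, num_masked=2, seed=0)
```

In this toy setup the first location points in the same direction as the region embedding, so it receives the largest masking probability. Masking it during training makes the model practise predicting for locations that resemble the region with no observations.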