STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning

Wei Shao; Yufan Kang; Ziyan Peng; Xiao Xiao; Lei Wang; Yuhui Yang; Flora D Salim

TL;DR

STEMO addresses the challenge of making timely yet accurate spatio-temporal forecasts by casting early prediction as a multi-objective reinforcement learning (MORL) problem. It introduces a spatio-temporal predictor based on Multi-Graph Convolutional Networks (MGCN), a state-generation mechanism built on biased random walks, and a policy learner that optimizes per-node forecast times under learned or provided preferences, including a hidden-preference discovery component. The approach achieves better Pareto-front performance than existing methods (higher hypervolume and better spacing) on the METR-LA, EMS, and NYPD datasets and demonstrates the value of adaptive timing over fixed prediction horizons. The work advances practical, data-driven balancing of timeliness and accuracy in critical forecasting tasks and provides a framework for discovering task-specific preferences in MORL settings.
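As an illustration of the hypervolume metric mentioned above, the following is a minimal sketch for two objectives that are both maximised (for example, an accuracy score and a timeliness score). The function names `pareto_front` and `hypervolume_2d`, the reference point, and the sample points are assumptions made for this example; the summary does not state how the paper configures the metric.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both objectives are maximised.

    The result is ordered by the first objective, descending, so the second
    objective is strictly ascending along the returned list.
    """
    front: List[Point] = []
    for p in sorted(set(points), key=lambda q: (-q[0], -q[1])):
        # A point survives only if it improves on the best second objective
        # seen so far among points with a larger-or-equal first objective.
        if not front or p[1] > front[-1][1]:
            front.append(p)
    return front


def hypervolume_2d(points: List[Point], ref: Point = (0.0, 0.0)) -> float:
    """Area dominated by the front and bounded below by the reference point."""
    front = [p for p in pareto_front(points) if p[0] > ref[0] and p[1] > ref[1]]
    area = 0.0
    prev_y = ref[1]
    for x, y in front:
        # Each front point contributes a horizontal slab above the previous one.
        area += (x - ref[0]) * (y - prev_y)
        prev_y = y
    return area


if __name__ == "__main__":
    candidates = [(4, 1), (3, 2), (1, 3), (2, 1)]  # (2, 1) is dominated by (3, 2)
    print(pareto_front(candidates))
    print(hypervolume_2d(candidates))
```

A larger hypervolume means the set of learned policies covers more of the objective space relative to the reference point, which is why it is commonly used to compare Pareto fronts.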

Abstract

Accuracy and timeliness are often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasts are vital for safeguarding human life and property. Consequently, finding a balance between accuracy and timeliness is crucial. In this paper, we propose an early spatio-temporal forecasting model based on multi-objective reinforcement learning that can either implement an optimal policy given a preference or infer the preference from a small number of samples. The model addresses two primary challenges: 1) enhancing the accuracy of early forecasting and 2) providing the optimal policy for determining the most suitable prediction time for each area. Our method demonstrates superior performance on three large-scale real-world datasets, surpassing existing methods in early spatio-temporal forecasting tasks.
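To make the accuracy/timeliness trade-off concrete, the sketch below scores every candidate prediction time for a single node with a preference-weighted sum and picks the best one. This is only an illustrative oracle, assuming the forecast error at each step is already known; it is not the reinforcement-learning policy the paper trains. The function name `best_halt_time`, the linear lateness penalty, and the weights `w_acc` and `w_time` are hypothetical.

```python
from typing import Sequence


def best_halt_time(errors: Sequence[float], w_acc: float, w_time: float) -> int:
    """Return the step t that maximises a preference-weighted score.

    errors[t] is the (assumed known) forecast error if the node halts at
    step t. The score is -w_acc * errors[t] - w_time * lateness(t), where
    lateness grows linearly from 0 at the first step to 1 at the last.
    """
    horizon = len(errors)
    if horizon == 0:
        raise ValueError("errors must contain at least one step")

    def score(t: int) -> float:
        lateness = t / (horizon - 1) if horizon > 1 else 0.0
        return -w_acc * errors[t] - w_time * lateness

    # max() returns the earliest step when several steps tie.
    return max(range(horizon), key=score)


if __name__ == "__main__":
    errors = [0.9, 0.5, 0.3, 0.25, 0.2]  # error shrinks as more data arrives
    print(best_halt_time(errors, w_acc=1.0, w_time=0.0))  # accuracy only
    print(best_halt_time(errors, w_acc=0.0, w_time=1.0))  # timeliness only
    print(best_halt_time(errors, w_acc=0.5, w_time=0.5))  # balanced preference
```

Changing the preference weights moves the chosen time along the horizon, which is the behaviour a preference-conditioned policy has to reproduce without access to future errors.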

Paper Structure

This paper contains 22 sections, 12 equations, 4 figures, 5 tables, 1 algorithm.

Introduction
Related Work
Spatio-temporal Prediction
Early Prediction
Problem Definition
Methodology
The Spatio-temporal Predictor
State Generator
Finding the Optimal Set of Policies
Finding the Hidden Preference
Experiment
Experiment Settings
Datasets
Parameters Settings
Baselines and Metrics
...and 7 more sections

Figures (4)

Figure 1: Example of three methods for early spatio-temporal forecasting. Each circle is a node (e.g. a sensor) with a recorded value (such as speed) over time. The blue plane represents the prediction time, and the recorded values after the blue plane are not used for forecasting. The adaptive early forecasting method adjusts data usage and dynamically determines the prediction time for different nodes. The node colour at time $T$ indicates the accuracy: green indicates that the predicted value matches the ground truth, and red indicates that it does not.
Figure 2: At time $t$, the encoder processes the recorded values $\mathbf{X}_{0:t}$ to extract spatio-temporal features and generate the hidden state $\mathbf{H}_t$. Using $\mathbf{H}_t$, the decoder generates a series of forecasted values, focusing on $\mathbf{\widehat{X}}^{(t)}_{T}$. The state generator concatenates the node embedding result and $\mathbf{H}_t$ to generate the state $\mathbf{s}_t$. The policy utilises $\mathbf{s}_t$ to determine the optimal time for each node $v_i\in\mathbf{V}$ via the action set $\mathbf{a}_t=\{a_t^i\}_{i=1}^n$ (halt or wait). 'Wait' implies that further observation of recorded values is necessary, while 'Halt' implies that time $t$ is the optimal time $t_i^*$ for node $v_i$, and the corresponding forecasted value is recorded in $\mathbf{\widehat{X}}_T$.
Figure 3: Panel (a) shows the time series of nodes $v_i$ and $v_j$. The dotted line represents the series after the optimal time; only the solid line needs to be observed to make a forecast. The two solid dots in panel (a) correspond to the two red circles in panel (b). For example, the lower red circle corresponds to the DTW distance between $x_{0:t}^i$ and $x_{0:t}^j$, which is calculated along the green grid path. We anticipate that node $i$ will acquire feature $x_{t_j^*}^j$ at time $t$ through MGCN, allowing it to make a more precise forecast earlier.
Figure 4: We assume node 1 (red) is the central node. The blue nodes represent spatially close nodes, while the green nodes represent temporally similar nodes. MGCN considers these two kinds of nodes, where spatial and temporal similarities are processed separately but in conjunction.
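The DTW distance referred to in the Figure 3 caption can be sketched with the textbook dynamic-programming recurrence. This is a minimal version assuming scalar observations, an absolute-difference local cost, and no warping window; the captions do not specify which variant the paper uses, and the function name `dtw_distance` is chosen for this example.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic time warping distance between two non-empty series.

    D[i][j] holds the cheapest cumulative cost of aligning the first i
    values of a with the first j values of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    D: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


if __name__ == "__main__":
    x_i = [1.0, 2.0, 3.0]
    x_j = [1.0, 2.0, 2.0, 3.0]  # same shape, one value repeated
    print(dtw_distance(x_i, x_j))
```

Because the alignment may repeat values, two series with the same shape but different pacing get a small distance, which is the property that lets temporally similar nodes be linked even when they are not spatially close.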
