Spatio-temporal Multivariate Time Series Forecast with Chosen Variables
Zibo Liu, Zhe Jiang, Zelin Xu, Tingsong Xiao, Yupu Zhang, Zhengkun Xiao, Haibo Wang, Shigang Chen
TL;DR
This work defines Spatio-Temporal Multivariate Forecast with Chosen Variables (STCV), which addresses a practical constraint: only $m$ permanent sensors can be deployed among $n$ locations, yet all $n$ locations must be forecast. The authors propose VIP (Variable-Parameter Iterative Pruning), a framework that jointly optimizes input variable selection and model sparsity through three components: masked variable-parameter pruning, a prioritized replay mechanism that mitigates forgetting, and dynamic extrapolation from the selected variables to all variables. They provide a formal problem definition, complexity analyses, and a detailed architecture that starts from a base STMF model and progressively prunes both input variables and attention parameters. Experiments on five real-world datasets show significant gains in forecast accuracy and efficiency, indicating that optimizing sensor placement can improve predictive performance under budget constraints and that the approach scales to large spatio-temporal systems.
Abstract
Spatio-Temporal Multivariate time series Forecast (STMF) uses the time series of $n$ spatially distributed variables over a recent period to forecast their values over a near-future period. It has important applications in spatio-temporal sensing, such as road traffic prediction and air pollution prediction. Recent papers have addressed the practical problem of missing variables in the model input, which arises in sensing applications where, due to budget constraints, the number $m$ of sensors is far smaller than the number $n$ of locations to be monitored. We observe that the state of the art assumes that the $m$ variables (i.e., locations with sensors) in the model input are pre-determined, and that the important problem of how to choose these $m$ variables has never been studied. This paper fills the gap by studying a new problem, STMF with chosen variables, which selects $m$ out of $n$ variables for the model input so as to maximize the forecast accuracy. We propose a unified framework that jointly performs variable selection and model optimization for both forecast accuracy and model efficiency. It consists of three novel technical components: (1) masked variable-parameter pruning, which progressively prunes less informative variables and attention parameters through quantile-based masking; (2) prioritized variable-parameter replay, which replays low-loss past samples to preserve learned knowledge for model stability; (3) a dynamic extrapolation mechanism, which propagates information from the variables selected for the input to all other variables via learnable spatial embeddings and adjacency information. Experiments on five real-world datasets show that our work significantly outperforms the state-of-the-art baselines in both accuracy and efficiency, demonstrating the effectiveness of joint variable selection and model optimization.
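To make the idea of quantile-based masking concrete, the following is a minimal sketch of progressively pruning $n$ candidate variables down to $m$. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `prune_frac` parameter, and the per-variable importance scores are all hypothetical, and the scores are held fixed here, whereas in a trained model they would presumably be updated between pruning rounds.

```python
import math


def quantile_threshold(scores, q):
    """Return the q-quantile (0 <= q <= 1) of scores, with linear interpolation."""
    ordered = sorted(scores)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def prune_step(scores, mask, prune_frac, m):
    """One pruning round: mask out the weakest active variables, never going below m."""
    active = [i for i, keep in enumerate(mask) if keep]
    if len(active) <= m:
        return list(mask)
    # Threshold is the prune_frac-quantile of the scores of still-active variables.
    thr = quantile_threshold([scores[i] for i in active], prune_frac)
    # Candidates at or below the threshold, weakest first.
    candidates = sorted((i for i in active if scores[i] <= thr),
                        key=lambda i: scores[i])
    budget = len(active) - m  # how many variables may still be removed
    new_mask = list(mask)
    for i in candidates[:budget]:
        new_mask[i] = False
    return new_mask


def select_variables(scores, m, prune_frac=0.25):
    """Repeat pruning rounds until exactly m variables remain (hypothetical helper)."""
    # Assumption: scores are fixed; a real model would re-estimate them each round.
    mask = [True] * len(scores)
    while sum(mask) > m:
        mask = prune_step(scores, mask, prune_frac, m)
    return [i for i, keep in enumerate(mask) if keep]


# Example: six candidate locations with hypothetical importance scores, budget m = 3.
chosen = select_variables([0.9, 0.1, 0.5, 0.3, 0.7, 0.2], m=3)
print(chosen)
```

Each round removes at least the lowest-scoring active variable, so the loop terminates. Pruning in several small rounds rather than one cut is what would give a model the chance to adapt between rounds.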
