AutoSTF: Decoupled Neural Architecture Search for Cost-Effective Automated Spatio-Temporal Forecasting

Tengfei Lyu; Weijia Zhang; Jinliang Deng; Hao Liu

TL;DR

AutoSTF targets the high computational cost of neural architecture search (NAS) for automated spatio-temporal forecasting. It decouples the mixed search space into separate temporal and spatial spaces, and uses representation compression and parameter sharing to reduce overhead. A multi-patch transfer module captures multi-granularity temporal dependencies, and a layer-wise spatial adjacency search models fine-grained spatio-temporal patterns. The framework uses differentiable NAS with bi-level optimization to jointly learn architecture and network parameters. Experiments on eight datasets show up to a 13.48x speed-up together with state-of-the-art forecasting accuracy. Overall, AutoSTF indicates that decoupled search, combined with patch-based temporal processing and flexible spatial operators, gives practical, scalable improvements for smart-city spatio-temporal forecasting.
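The differentiable search mentioned above can be illustrated with a generic continuous relaxation in the style of DARTS: each candidate operator on an edge gets an architecture score, the edge output is the softmax-weighted sum of all candidate outputs, and the highest-scoring operator is kept once the search ends. The sketch below is only an illustration under assumptions. The operator set (`identity`, `zero`, `smooth`) and the function names are hypothetical and are not taken from AutoSTF, and the bi-level training loop (network weights updated on training data, architecture scores on validation data) is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operators acting on a 1-D feature vector.
# They stand in for real temporal/spatial operators and are NOT AutoSTF's set.
def identity(x):
    return list(x)

def zero(x):
    return [0.0 for _ in x]

def smooth(x):
    # Average each element with its predecessor; the first element is unchanged.
    return [x[0]] + [(x[i - 1] + x[i]) / 2 for i in range(1, len(x))]

CANDIDATES = {"identity": identity, "zero": zero, "smooth": smooth}

def mixed_op(x, alpha):
    """Softmax-weighted sum of all candidate outputs (continuous relaxation)."""
    weights = softmax(alpha)
    outputs = [op(x) for op in CANDIDATES.values()]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]

def derive_op(alpha):
    """After the search, keep the operator with the largest architecture score."""
    names = list(CANDIDATES)
    best = max(range(len(alpha)), key=lambda k: alpha[k])
    return names[best]
```

With equal scores, `mixed_op` simply averages the candidates. As one score grows, the mixed output approaches that single operator, which is why taking the arg-max at the end gives a discrete architecture.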

Abstract

Spatio-temporal forecasting is a critical component of various smart city applications, such as transportation optimization, energy management, and socio-economic analysis. Recently, several automated spatio-temporal forecasting methods have been proposed to automatically search for the optimal neural network architecture for capturing complex spatio-temporal dependencies. However, existing automated approaches suffer from expensive neural architecture search overhead, which hinders their practical use and the further exploration of diverse spatio-temporal operators at a finer granularity. In this paper, we propose AutoSTF, a decoupled automatic neural architecture search framework for cost-effective automated spatio-temporal forecasting. From the efficiency perspective, we first decouple the mixed search space into a temporal space and a spatial space, and devise representation compression and parameter-sharing schemes, respectively, to mitigate the parameter explosion. The decoupled spatio-temporal search not only expedites the model optimization process but also leaves room for more effective spatio-temporal dependency modeling. From the effectiveness perspective, we propose a multi-patch transfer module to jointly capture multi-granularity temporal dependencies and extend the spatial search space to enable finer-grained layer-wise spatial dependency search. Extensive experiments on eight datasets demonstrate the superiority of AutoSTF in terms of both accuracy and efficiency. Specifically, our proposed method achieves up to 13.48x speed-up compared to state-of-the-art automatic spatio-temporal forecasting methods while maintaining the best forecasting accuracy.

Paper Structure (31 sections, 16 equations, 13 figures, 13 tables)

This paper contains 31 sections, 16 equations, 13 figures, 13 tables.

Introduction
Preliminaries
The AutoSTF Framework
Embedding Layers
Temporal Search Module
Multi-patch Transfer Module
Spatial Search Module
Output Layer and Search Strategy
Experiments
Experimental Settings
Overall Results of AutoSTF
Efficiency Study
Ablation Study
Related Work
Conclusion
...and 16 more sections

Figures (13)

Figure 1: The training time and forecasting accuracy (mean squared error) comparison of manually-designed and automated spatio-temporal forecasting models on the METR-LA dataset. Our proposed method (AutoSTF) achieves the best forecasting accuracy while taking much less training time compared with all existing automated models.
Figure 2: AutoSTF framework. (a) shows the overview framework of AutoSTF. (b) depicts the Embedding Layers, which consist of raw time series embedding, node embedding, and time embedding. (c) illustrates the Temporal Search Module, which searches for the optimal temporal operator within the Temporal-DAG to model complex temporal dependencies. (d) describes the Multi-patch Transfer, which segments the embedding into several patches along the temporal feature axis and compresses each into a dense semantic representation. (e) presents the Spatial Search Module, tasked with searching for the optimal spatial operator and integrating it with the preceding temporal dependencies to uncover fine-grained spatio-temporal correlations. The two-headed arrow denotes the search operation. The single-headed arrow denotes the operator, and different colors represent different operators.
Figure 3: Multi-patch transfer module. The embedding $H_{\mathcal{T}}$ is the output of the Temporal Search Module. $P$ represents historical time steps segmented into $M$ patches.
Figure 4: The spatial search in a Spatial-DAG. $\mathcal{S}_*$ denotes the latent representation. The search operation aims to identify an optimal spatial operator between any two nodes (e.g., $\mathcal{S}_1 \to \mathcal{S}_2$). Once identified, this operator transfers the latent representation from $\mathcal{S}_1$ to $\mathcal{S}_2$.
Figure 5: Search time on all datasets.
...and 8 more figures
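The patch segmentation described in the captions for Figure 2(d) and Figure 3 can be sketched as follows: a sequence of $P$ historical time steps is cut into $M$ equal patches, and each patch is compressed into one vector. This is a minimal sketch under stated assumptions. Mean pooling stands in for the paper's compression step, which this page does not specify, and the function names `split_into_patches`, `compress_patch`, and `multi_patch_transfer` are hypothetical.

```python
def split_into_patches(sequence, num_patches):
    """Cut a list of P time steps into M contiguous, equally sized patches."""
    if num_patches <= 0 or len(sequence) % num_patches != 0:
        raise ValueError("P must be a positive multiple of M")
    size = len(sequence) // num_patches
    return [sequence[i * size:(i + 1) * size] for i in range(num_patches)]

def compress_patch(patch):
    """Assumed stand-in for compression: per-feature mean over the patch."""
    dim = len(patch[0])
    return [sum(step[d] for step in patch) / len(patch) for d in range(dim)]

def multi_patch_transfer(sequence, num_patches):
    """Return one compressed vector per patch (M vectors in total)."""
    return [compress_patch(p) for p in split_into_patches(sequence, num_patches)]

# Usage: P = 12 time steps with 2 features each, segmented into M = 3 patches.
history = [[float(t), 2.0 * t] for t in range(12)]
patches = multi_patch_transfer(history, num_patches=3)
```

Each of the $M$ outputs summarizes a different stretch of the history, so later modules work on $M$ vectors rather than $P$ time steps.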

Theorems & Definitions (2)

Definition 1: Spatial Graph
Definition 2: Graph Signal Matrix
