Weaver: Kronecker Product Approximations of Spatiotemporal Attention for Traffic Network Forecasting
Christopher Cheong, Gary Davis, Seongjin Choi
TL;DR
Weaver introduces a scalable spatiotemporal forecasting framework for traffic networks by decomposing full spatiotemporal attention via Kronecker product approximations, enabling efficient P2-KMV message passing on a Kronecker-TEN representation. It couples local signed spatial/temporal attention with a Traffic Phase Dictionary for self-conditioning and uses a Continuous Tanimoto Coefficient to model negative traffic interactions stably. The approach achieves competitive accuracy on PEMS-BAY and METR-LA, while delivering strong training efficiency and robustness under missing data. Ablations show the Kronecker attention, valence attention, and phase dictionary each contribute to stability and performance, particularly at longer horizons. The work provides a principled, physics-inspired, graph-based perspective with potential extensions to transferability, geometry-aware retrieval, and physics-informed modeling.
Abstract
Spatiotemporal forecasting on transportation networks is a complex task that requires understanding how traffic nodes interact within a dynamic, evolving system governed by traffic flow dynamics and social behavioral patterns. The importance of transportation networks and intelligent transportation systems (ITS) for modern mobility and commerce calls for forecasting models that are not only accurate but also interpretable, efficient, and robust under structural or temporal perturbations. Recent approaches, particularly Transformer-based architectures, have improved predictive performance, but often at the cost of high computational overhead and diminished architectural interpretability. In this work, we introduce Weaver, a novel attention-based model that applies Kronecker product approximations (KPA) to decompose the PN × PN spatiotemporal attention map, which has O(P^2N^2) complexity, into local P × P temporal and N × N spatial attention maps. This Kronecker attention map enables our Parallel-Kronecker Matrix-Vector product (P2-KMV), which performs spatiotemporal message passing with O(P^2N + N^2P) complexity. To capture real-world traffic dynamics, we account for negative edges in modeling traffic behavior by introducing Valence Attention based on the continuous Tanimoto coefficient (CTC), whose properties are conducive to precise latent graph generation and training stability. To fully utilize the model's learning capacity, we introduce the Traffic Phase Dictionary for self-conditioning. Evaluations on PEMS-BAY and METR-LA show that Weaver achieves competitive performance across model categories while training more efficiently.
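The complexity reduction quoted in the abstract follows from a standard identity for Kronecker products: applying a P × P temporal map and an N × N spatial map in sequence gives the same result as applying their PN × PN Kronecker product to the flattened input. The NumPy sketch below illustrates that identity only. It is not the authors' P2-KMV implementation, whose parallelization scheme is not described here, and the function names `kron_matvec` and `dense_matvec`, the (P, N) input layout, and the random signed matrices standing in for attention maps are all assumptions made for illustration.

```python
import numpy as np


def kron_matvec(A_t, A_s, X):
    """Apply (A_t kron A_s) to X without building the PN x PN matrix.

    A_t: (P, P) temporal map, A_s: (N, N) spatial map (hypothetical stand-ins
    for attention maps), X: (P, N) signal with P time steps and N nodes.
    """
    # A_t @ X mixes time steps in O(P^2 N); the product with A_s.T then
    # mixes nodes in O(N^2 P).
    return A_t @ X @ A_s.T


def dense_matvec(A_t, A_s, X):
    """Reference version that materializes the full PN x PN map, O(P^2 N^2)."""
    P, N = X.shape
    full = np.kron(A_t, A_s)  # shape (P * N, P * N)
    # Row-major flattening of X matches the block layout produced by np.kron.
    return (full @ X.reshape(-1)).reshape(P, N)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P, N = 12, 5  # illustrative sizes only
    A_t = rng.standard_normal((P, P))  # signed entries, as negative edges allow
    A_s = rng.standard_normal((N, N))
    X = rng.standard_normal((P, N))
    print(np.allclose(kron_matvec(A_t, A_s, X), dense_matvec(A_t, A_s, X)))
```

The factored version never stores the PN × PN matrix, so for traffic networks with hundreds of sensors the saving applies to memory as well as arithmetic.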
