A Spatio-Temporal Online Robust Tensor Recovery Approach for Streaming Traffic Data Imputation
Yiyang Yang, Xiejian Chi, Shanxing Gao, Kaidong Wang, Yao Wang
TL;DR
This work tackles missing data and outliers in streaming ITS traffic data by formulating the problem as online robust tensor recovery and introducing STORTD, a Spatio-Temporal Online Robust Tucker Decomposition. STORTD jointly enforces global low‑rank spatio‑temporal structure via a Tucker model and local priors through spatial Laplacian regularization and temporal Toeplitz smoothness, updated incrementally as new data arrive. The authors provide a detailed online algorithm with efficient per‑step updates for outliers, spatial/temporal factors, and the core tensor, and demonstrate substantial accuracy gains and up to $10^3$× speedups over batch methods across three real datasets. The approach offers scalable, robust, real‑time traffic data imputation, with strong potential for ITS data quality enhancement.
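As a rough illustration of the two local priors named above, the following Python sketch evaluates a graph-Laplacian penalty over sensors and a Toeplitz first-difference penalty over time on toy data. It is not the authors' implementation: the function names, the path-graph sensor layout, and the choice of a first-difference Toeplitz operator are all assumptions made for illustration, and the Tucker factorization and online updates are omitted.

```python
# Illustrative only: toy versions of a spatial Laplacian prior and a
# temporal Toeplitz smoothness prior. All names here are hypothetical
# and are not taken from the paper.

def laplacian(n, edges):
    """Unnormalized graph Laplacian L = D - A for an undirected graph."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L

def toeplitz_diff(t):
    """(t-1) x t first-difference Toeplitz matrix (assumed form)."""
    T = [[0.0] * t for _ in range(t - 1)]
    for k in range(t - 1):
        T[k][k] = -1.0
        T[k][k + 1] = 1.0
    return T

def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def spatial_penalty(L, x):
    """x^T L x, equal to the sum of (x_i - x_j)^2 over graph edges."""
    return sum(a * b for a, b in zip(x, matvec(L, x)))

def temporal_penalty(T, y):
    """||T y||^2, the sum of squared consecutive differences."""
    return sum(d * d for d in matvec(T, y))

# Three sensors on a line (0-1-2) and a four-step time series.
L = laplacian(3, [(0, 1), (1, 2)])
T = toeplitz_diff(4)
smooth_space = spatial_penalty(L, [1.0, 1.0, 1.0])   # neighbors agree
rough_space = spatial_penalty(L, [1.0, 3.0, 0.0])    # neighbors disagree
smooth_time = temporal_penalty(T, [2.0, 2.0, 2.0, 2.0])
rough_time = temporal_penalty(T, [0.0, 2.0, 0.0, 2.0])
```

Both penalties vanish for constant signals and grow with the disagreement between neighboring sensors or consecutive time steps, which is the sense in which such regularizers encourage local consistency.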
Abstract
Data quality is critical to Intelligent Transportation Systems (ITS), as complete and accurate traffic data underpin reliable decision-making in traffic control and management. Recent advances in low-rank tensor recovery algorithms have shown strong potential in capturing the inherent structure of high-dimensional traffic data and restoring degraded observations. However, traditional batch-based methods demand substantial computational and storage resources, which limits their scalability as traffic data volumes continue to grow. Moreover, recent online tensor recovery methods often degrade severely in complex real-world scenarios because they do not sufficiently exploit the intrinsic structural properties of traffic data. To address these challenges, we reformulate the traffic data recovery problem within a streaming framework and propose a novel online robust tensor recovery algorithm that leverages both the global spatio-temporal correlations and the local consistency of traffic data. Our method handles missing and anomalous values simultaneously and adapts well to diverse missing patterns. Experimental results on three real-world traffic datasets demonstrate that the proposed approach achieves high recovery accuracy while improving computational efficiency by up to three orders of magnitude compared to state-of-the-art batch-based methods. These findings highlight the potential of the proposed approach as a scalable and effective solution for traffic data quality enhancement in ITS.
