ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jian Sun
TL;DR
ImputeFormer tackles spatiotemporal missing data by injecting a low-rank inductive bias into a Transformer framework. It introduces temporal projected attention and embedded spatial attention, plus a Fourier Imputation Loss to regularize the spectrum, yielding linear-time complexity in practice. Across traffic, energy, solar, and air quality datasets, it achieves state-of-the-art accuracy with superior efficiency and robustness, while offering interpretable mechanisms through spectrum and embedding analyses. This approach promises broad applicability to real-world imputation tasks and time-series representation learning, especially under highly sparse or cross-domain conditions.
Abstract
Missing data is a pervasive issue in both scientific and engineering tasks, especially in the modeling of spatiotemporal data. This problem has attracted many studies proposing data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assume general structural priors but have limited model capacity. The latter are highly expressive but lack prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both paradigms, we demonstrate a low rankness-induced Transformer that balances strong inductive bias with high model expressivity. Exploiting the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable to a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong evidence that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model for a wide range of spatiotemporal imputation problems.
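To make the low-rank prior mentioned above concrete, the following is a minimal sketch of a classical low-rank imputation baseline (iterative truncated SVD). It is not ImputeFormer and not the authors' code. The function name `low_rank_impute`, the synthetic sensors-by-time matrix, and the chosen rank and missing rate are all assumptions made for illustration.

```python
import numpy as np

def low_rank_impute(X, mask, rank=2, n_iters=200):
    """Hypothetical helper: fill entries where mask is False by
    repeatedly projecting onto rank-`rank` matrices (truncated SVD)."""
    # Start by zero-filling the missing entries.
    X_filled = np.where(mask, X, 0.0)
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X_filled, full_matrices=False)
        # Best rank-`rank` approximation of the current estimate.
        X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep observed values, replace only the missing ones.
        X_filled = np.where(mask, X, X_low)
    return X_filled

# Synthetic example (assumed data): 20 sensors, 50 time steps, true rank 2.
rng = np.random.default_rng(0)
truth = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 50))
mask = rng.random(truth.shape) > 0.3   # True = observed, roughly 30% missing

estimate = low_rank_impute(truth, mask, rank=2)
error = np.abs(estimate - truth)[~mask].mean()
baseline = np.abs(truth)[~mask].mean()  # error of plain zero-filling
print(f"low-rank MAE on missing entries: {error:.4f} (zero-fill: {baseline:.4f})")
```

A baseline of this kind encodes the structural prior directly but has a fixed, limited capacity, which is the trade-off the abstract contrasts with deep learning models.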
