Learning from History: A Retrieval-Augmented Framework for Spatiotemporal Prediction
Hao Jia, Penghao Zhao, Hao Wu, Yuan Gao, Yangyu Tao, Bin Cui
TL;DR
The paper tackles long-term, high-fidelity spatiotemporal forecasting in complex physical systems, where purely parametric deep learning models accumulate error and violate physical realism. It introduces Retrieval-Augmented Prediction (RAP), a three-stage framework (Retrieve, Augment, Predict) that uses historical analogs as non-parametric dynamic guidance: the true future of the retrieved analog, $\boldsymbol{Y}_{\text{ref}}$, is fed into a dual-stream network alongside the current state $\boldsymbol{X}_{\text{query}}$. Rather than acting as a hard constraint, $\boldsymbol{Y}_{\text{ref}}$ serves as a conditional input that regularizes learning. The loss combines $\mathcal{L}_1$ and $\mathcal{L}_{\text{MSE}}$ terms and excludes the reference to avoid trivial copying. The authors validate RAP on ERA5 weather forecasting, 2D turbulence, and fire-spread simulations, reporting consistent improvements over diverse baselines and better physical fidelity, including sharper vortices and flame fronts in long-term rollouts. They also report robustness and scalability results, including data-efficiency gains for large models, and ablations that underscore the importance of the dual-stream integration and of history-derived guidance for stable, physically plausible predictions.
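As a sketch of how the conditioning and the loss described above fit together, one plausible way to write them is given below. The network symbol $f_\theta$, the prediction $\hat{\boldsymbol{Y}}$, the ground-truth future $\boldsymbol{Y}$, and the weights $\lambda_1, \lambda_2$ are notation assumed here for illustration; the summary does not state the paper's exact weighting.

$$
\hat{\boldsymbol{Y}} = f_\theta\!\left(\boldsymbol{X}_{\text{query}},\, \boldsymbol{Y}_{\text{ref}}\right),
\qquad
\mathcal{L} = \lambda_1\, \mathcal{L}_1\!\left(\hat{\boldsymbol{Y}}, \boldsymbol{Y}\right) + \lambda_2\, \mathcal{L}_{\text{MSE}}\!\left(\hat{\boldsymbol{Y}}, \boldsymbol{Y}\right)
$$

In this form $\boldsymbol{Y}_{\text{ref}}$ appears only as an input to $f_\theta$ and never as a loss target, which matches the statement that the reference is excluded from the loss.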
Abstract
Accurate and long-term spatiotemporal prediction for complex physical systems remains a fundamental challenge in scientific computing. While deep learning models, as powerful parametric approximators, have shown remarkable success, they suffer from a critical limitation: the accumulation of errors during long-term autoregressive rollouts often leads to physically implausible artifacts. This deficiency arises from their purely parametric nature, which struggles to capture the full constraints of a system's intrinsic dynamics. To address this, we introduce a novel \textbf{Retrieval-Augmented Prediction (RAP)} framework, a hybrid paradigm that synergizes the predictive power of deep networks with the grounded truth of historical data. The core philosophy of RAP is to leverage historical evolutionary exemplars as a non-parametric estimate of the system's local dynamics. For any given state, RAP efficiently retrieves the most similar historical analog from a large-scale database. The true future evolution of this analog then serves as a \textbf{reference target}. Critically, this target is not a hard constraint in the loss function but rather a powerful conditional input to a specialized dual-stream architecture. It provides strong \textbf{dynamic guidance}, steering the model's predictions towards physically viable trajectories. In extensive benchmarks across meteorology, turbulence, and fire simulation, RAP not only surpasses state-of-the-art methods but also significantly outperforms a strong \textbf{analog-only forecasting baseline}. More importantly, RAP generates predictions that are more physically realistic by effectively suppressing error divergence in long-term rollouts.
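The Retrieve and Augment stages, together with the loss, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`retrieve_analog`, `augment`, `combined_loss`), the brute-force L2 nearest-neighbor search, the channel-stacking of query and reference, and the `mse_weight` parameter are all hypothetical. The Predict stage (the dual-stream network) is omitted.

```python
import numpy as np

def retrieve_analog(x_query, hist_states, hist_futures):
    """Retrieve: find the historical state closest to x_query (L2 distance,
    an assumed similarity measure) and return its index and true future."""
    flat_hist = hist_states.reshape(len(hist_states), -1)
    dists = np.linalg.norm(flat_hist - x_query.reshape(1, -1), axis=1)
    idx = int(np.argmin(dists))
    return idx, hist_futures[idx]

def augment(x_query, y_ref):
    """Augment: pair the query with the reference future. Stacking on a
    leading axis is an assumed stand-in for the dual-stream input."""
    return np.stack([x_query, y_ref], axis=0)

def combined_loss(y_pred, y_true, mse_weight=1.0):
    """L1 plus weighted MSE against the ground truth only.
    The reference y_ref is deliberately not an argument here."""
    l1 = np.mean(np.abs(y_pred - y_true))
    mse = np.mean((y_pred - y_true) ** 2)
    return l1 + mse_weight * mse

# Toy database: 100 historical 8x8 states with their (synthetic) futures.
rng = np.random.default_rng(0)
hist_states = rng.normal(size=(100, 8, 8))
hist_futures = rng.normal(size=(100, 8, 8))

# A query that is a slightly perturbed copy of historical state 42.
x_query = hist_states[42] + 0.01 * rng.normal(size=(8, 8))

idx, y_ref = retrieve_analog(x_query, hist_states, hist_futures)
inputs = augment(x_query, y_ref)

# An analog-only baseline would simply output y_ref as the forecast;
# RAP instead passes `inputs` to a learned network (not shown).
analog_only_forecast = y_ref
```

At scale, the brute-force search would typically be replaced by an approximate nearest-neighbor index; the sketch keeps the exhaustive version only because it is easy to check.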
