Enhanced Spatiotemporal Prediction Using Physical-guided And Frequency-enhanced Recurrent Neural Networks
Xuanle Zhao, Yue Sun, Tielin Zhang, Bo Xu
TL;DR
This work addresses the challenge of accurate spatiotemporal prediction under data constraints by introducing a physical-guided neural network that combines a frequency-enhanced Fourier pathway, a moment loss, and an adaptive PDE-guided Runge-Kutta updater. The model fuses Transformer-based spatial corrections with Fourier-based physical representations, and updates latent states with an adaptive RK2 scheme to enforce PDE-consistent dynamics. Across diverse spatiotemporal and video benchmarks, it achieves state-of-the-art or competitive performance while using significantly fewer parameters, demonstrating the practical value of physics-informed design in dynamic forecasting.
Abstract
Spatiotemporal prediction plays an important role in modeling natural phenomena and processing video frames, especially in weather forecasting and human action recognition. Recent approaches incorporate prior physical knowledge into deep learning frameworks to estimate the unknown governing partial differential equations (PDEs), and have shown promising results in spatiotemporal prediction tasks. However, previous approaches acquire physical or PDE features only by restricting the neural network architecture or the loss function, which reduces the representational capacity of the network. Moreover, they cannot effectively estimate how the physical state is updated over time. To address these problems, this paper proposes a physical-guided neural network that uses a frequency-enhanced Fourier module and a moment loss to strengthen the model's ability to estimate spatiotemporal dynamics. Furthermore, we propose an adaptive second-order Runge-Kutta method with physical constraints to model the physical states more precisely. We evaluate our model on both spatiotemporal and video prediction tasks. The experimental results show that our model outperforms state-of-the-art methods, achieving the best results on several datasets with a much smaller parameter count.
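To make the second-order Runge-Kutta update concrete, the following is a minimal sketch and not the model's actual updater. It assumes the standard one-parameter family of explicit RK2 schemes, where a hypothetical coefficient `alpha` (which could be fixed or made learnable) selects the midpoint (0.5), Ralston (2/3), or Heun (1.0) variant. The names `rk2_step`, `laplacian`, and `diffusion_rhs` are illustrative, and a simple periodic diffusion term stands in for the learned PDE right-hand side.

```python
import numpy as np


def rk2_step(f, u, dt, alpha=1.0):
    """One step of the one-parameter explicit RK2 family.

    alpha = 0.5 gives the midpoint rule, 2/3 Ralston's method, 1.0 Heun's method.
    Treating alpha as an adjustable coefficient is an assumption of this sketch.
    """
    k1 = f(u)                          # slope at the current state
    k2 = f(u + alpha * dt * k1)        # slope at the intermediate state
    w2 = 1.0 / (2.0 * alpha)           # weight that keeps the scheme second order
    return u + dt * ((1.0 - w2) * k1 + w2 * k2)


def laplacian(u):
    """Five-point Laplacian on a periodic 2D grid with unit spacing."""
    return (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )


def diffusion_rhs(u, nu=0.1):
    """Hypothetical stand-in for a learned PDE term: du/dt = nu * Laplacian(u)."""
    return nu * laplacian(u)


# Advance a random 16x16 state by one step using the Ralston variant.
rng = np.random.default_rng(0)
u0 = rng.standard_normal((16, 16))
u1 = rk2_step(diffusion_rhs, u0, dt=0.1, alpha=2.0 / 3.0)
```

In a recurrent setting, `u` would be a latent physical state and `f` a learned operator; the stepping logic stays the same regardless of which `alpha` is chosen.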
