GMG: A Video Prediction Method Based on Global Focus and Motion Guided
Yuhao Du, Hui Liu, Haoxiang Peng, Xinyuan Cheng, Chenrong Wu, Jiankai Zhang
TL;DR
GMG tackles two core challenges in spatiotemporal weather prediction: long-range correlations and non-rigid motion deformation. It introduces a Global Focus Module to enlarge the receptive field and a Motion Guided Module to model deformation with learnable balance and decay dynamics, integrated within a four-unit GMG framework built on ST-ConvLSTM and Self-Attention Memory. Across five diverse datasets, GMG achieves state-of-the-art or competitive performance on standard metrics, demonstrating strong generalization across rainfall data, traffic-like data, and Moving MNIST. The results suggest that GMG provides a general, effective approach to complex video prediction tasks, with practical implications for meteorological forecasting and related domains.
Abstract
In recent years, weather forecasting has gained significant attention. However, accurately predicting weather remains a challenge due to the rapid variability of meteorological data and potential teleconnections. Current spatiotemporal forecasting models primarily rely on convolution operations or sliding windows for feature extraction. These methods are limited by the size of the convolutional kernel or sliding window, which makes it difficult to capture and identify potential teleconnection features in meteorological data. Additionally, weather data often involve non-rigid bodies whose motion is accompanied by unpredictable deformations, further complicating the forecasting task. In this paper, we propose the GMG model to address these two core challenges. The Global Focus Module, a key component of our model, enhances the global receptive field, while the Motion Guided Module adapts to the growth or dissipation processes of non-rigid bodies. In extensive evaluations, our method demonstrates competitive performance across various complex tasks, providing a novel approach to improving predictive accuracy on complex spatiotemporal data.
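To make the kernel-size limitation concrete, the following minimal sketch computes the per-axis receptive field of a stack of plain convolutions. It is a general illustration and not part of GMG: it assumes stride-1, undilated layers, and the function names `receptive_field` and `layers_to_cover` are hypothetical helpers introduced only for this example.

```python
import math


def receptive_field(num_layers: int, kernel_size: int) -> int:
    """Per-axis receptive field of stacked stride-1, undilated convolutions.

    Each additional layer widens the field by (kernel_size - 1) pixels.
    """
    return 1 + num_layers * (kernel_size - 1)


def layers_to_cover(extent: int, kernel_size: int) -> int:
    """Smallest number of stacked layers whose receptive field spans `extent` pixels."""
    return math.ceil((extent - 1) / (kernel_size - 1))


if __name__ == "__main__":
    # Illustrative numbers only: a 64-pixel-wide grid and 3x3 or 5x5 kernels.
    for k in (3, 5):
        n = layers_to_cover(64, k)
        print(f"kernel {k}: {n} layers give a receptive field of {receptive_field(n, k)}")
```

Under these assumptions, relating two opposite edges of a 64-pixel-wide field with 3x3 kernels takes dozens of stacked layers, which is the kind of constraint a global receptive field is meant to remove.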
