Decomposing weather forecasting into advection and convection with neural networks
Mengxuan Chen, Ziqi Yuan, Jinxiao Zhang, Runmin Dong, Haohuan Fu
TL;DR
This work decomposes weather forecasting into two neural components: a Graph Attention Network (GAT) to model horizontal advection and a Multi-Layer Perceptron (MLP) to represent vertical convection, enabling a lightweight, modular approach to global prediction. Trained on WeatherBench ERA5 data at 5.625° with an autoregressive 6-hour step scheme, the GAT-MLP architecture achieves competitive accuracy with a small parameter count (about 4.38M) and outperforms several data-driven baselines in one-step and 5-day iterative forecasts. Ablation studies and edge analyses demonstrate that separating dynamics and physics, incorporating time/space embeddings, and using a localized stencil-based graph improve robustness and fidelity, particularly for diurnal cycles in near-surface temperatures. The results suggest a promising direction for efficient, interpretable ML-assisted weather forecasting, with potential extensions to higher resolution and physics-consistent designs.
Abstract
Operational weather forecasting models have advanced for decades through improvements to both explicit numerical solvers and empirical physical parameterization schemes. However, the high computational cost and the uncertainties of these schemes motivate alternative machine learning methods. Previous works use a single unified model to learn both the dynamics and the physics of the atmosphere. In contrast, we propose a simple yet effective machine learning model that learns the horizontal movement in the dynamical core and the vertical movement in the physical parameterization separately. By replacing advection with a graph attention network and convection with a multi-layer perceptron, our model offers a new and efficient way to simulate the transition of variables in atmospheric models. We also assess the model's performance over 5-day iterative forecasts. Given the same input variables and training methods, our model outperforms existing data-driven methods at a resolution of 5.625° while using significantly fewer parameters. Overall, this work aims to contribute to ongoing efforts to leverage machine learning for improving both the accuracy and the efficiency of global weather forecasting.
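
To make the decomposition concrete, the following is a minimal NumPy sketch of the idea, not the authors' implementation. It pairs a GAT-style attention layer over a local horizontal stencil (standing in for advection) with a per-column MLP (standing in for convection), then rolls the model forward in 6-hour steps. The five-point stencil, the clamping at the poles, the number of levels, the hidden sizes, the residual update, the sequential ordering of the two components, and all function names are assumptions made for illustration; the weights are random rather than trained.

```python
import numpy as np

N_LAT, N_LON = 32, 64   # 5.625° grid: 180 / 5.625 = 32, 360 / 5.625 = 64
N_LEVELS = 4            # hypothetical number of vertical levels per column
HIDDEN = 8              # hypothetical hidden width
STEP_HOURS = 6          # autoregressive step length


def stencil_neighbors(n_lat, n_lon):
    """Index array of shape (n_nodes, 5): self, north, south, west, east.

    Longitude wraps around; latitude is clamped at the poles, which is a
    simplification chosen for this sketch.
    """
    lat, lon = np.meshgrid(np.arange(n_lat), np.arange(n_lon), indexing="ij")
    north = np.clip(lat - 1, 0, n_lat - 1)
    south = np.clip(lat + 1, 0, n_lat - 1)
    west = (lon - 1) % n_lon
    east = (lon + 1) % n_lon

    def flat(a, o):
        return (a * n_lon + o).ravel()

    return np.stack(
        [flat(lat, lon), flat(north, lon), flat(south, lon),
         flat(lat, west), flat(lat, east)],
        axis=1,
    )


def gat_advection(x, nbrs, W, a_src, a_dst):
    """Single-head GAT-style aggregation over each node's stencil."""
    h = x @ W                                      # (n_nodes, HIDDEN)
    e = (h @ a_src)[:, None] + (h @ a_dst)[nbrs]   # (n_nodes, 5) raw scores
    e = np.where(e > 0, e, 0.2 * e)                # LeakyReLU
    e = e - e.max(axis=1, keepdims=True)           # stabilise the softmax
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)      # attention over neighbors
    return (alpha[:, :, None] * h[nbrs]).sum(axis=1)


def mlp_convection(h, W1, b1, W2, b2):
    """Per-column MLP: mixes features within each node independently."""
    return np.maximum(h @ W1 + b1, 0.0) @ W2 + b2


def rollout(x0, step_fn, lead_hours):
    """Apply step_fn repeatedly, feeding each prediction back as input."""
    states, x = [], x0
    for _ in range(lead_hours // STEP_HOURS):
        x = step_fn(x)
        states.append(x)
    return np.stack(states)


rng = np.random.default_rng(0)
nbrs = stencil_neighbors(N_LAT, N_LON)
W = 0.1 * rng.standard_normal((N_LEVELS, HIDDEN))
a_src = 0.1 * rng.standard_normal(HIDDEN)
a_dst = 0.1 * rng.standard_normal(HIDDEN)
W1 = 0.1 * rng.standard_normal((HIDDEN, 16))
b1 = np.zeros(16)
W2 = 0.1 * rng.standard_normal((16, N_LEVELS))
b2 = np.zeros(N_LEVELS)


def step(x):
    # Assumed residual update: horizontal mixing first, then the column MLP.
    return x + mlp_convection(gat_advection(x, nbrs, W, a_src, a_dst),
                              W1, b1, W2, b2)


x0 = rng.standard_normal((N_LAT * N_LON, N_LEVELS))
forecast = rollout(x0, step, lead_hours=5 * 24)
print(forecast.shape)  # (20, 2048, 4): 20 six-hour steps over 5 days
```

In this sketch the attention layer only ever mixes information between horizontally adjacent grid points, while the MLP only mixes values within a single column, which mirrors the separation of horizontal and vertical movement described above. A 5-day forecast corresponds to 20 successive 6-hour steps.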
