Decomposing weather forecasting into advection and convection with neural networks

Mengxuan Chen, Ziqi Yuan, Jinxiao Zhang, Runmin Dong, Haohuan Fu

TL;DR

This work decomposes weather forecasting into two neural components: a Graph Attention Network (GAT) to model horizontal advection and a Multi-Layer Perceptron (MLP) to represent vertical convection, enabling a lightweight, modular approach to global prediction. Trained on WeatherBench ERA5 data at 5.625° with an autoregressive 6-hour step scheme, the GAT-MLP architecture achieves competitive accuracy with a small parameter count (about 4.38M) and outperforms several data-driven baselines in one-step and 5-day iterative forecasts. Ablation studies and edge analyses demonstrate that separating dynamics and physics, incorporating time/space embeddings, and using a localized stencil-based graph improve robustness and fidelity, particularly for diurnal cycles in near-surface temperatures. The results suggest a promising direction for efficient, interpretable ML-assisted weather forecasting, with potential extensions to higher resolution and physics-consistent designs.

Abstract

Operational weather forecasting models have advanced for decades through improvements to both explicit numerical solvers and empirical physical parameterization schemes. However, the high computational cost and the uncertainties of these schemes motivate improvements through alternative machine learning methods. Previous works use a unified model to learn both the dynamics and the physics of the atmospheric model. In contrast, we propose a simple yet effective machine learning model that separately learns the horizontal movement in the dynamical core and the vertical movement in the physical parameterization. By replacing the advection with a graph attention network and the convection with a multi-layer perceptron, our model provides a new and efficient perspective for simulating the transition of variables in atmospheric models. We also assess the model's performance over 5-day iterative forecasting. Under the same input variables and training methods, our model outperforms existing data-driven methods at a resolution of 5.625° with a significantly reduced number of parameters. Overall, this work aims to contribute to ongoing efforts that leverage machine learning techniques to improve both the accuracy and the efficiency of global weather forecasting.
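To make the decomposition concrete, the following is a minimal NumPy sketch, not the authors' implementation: a single-head graph attention layer mixes information along graph edges (standing in for horizontal advection), and a pointwise MLP then transforms each node independently (standing in for vertical convection). The toy ring graph, the layer widths, the random weights, and the function names `gat_layer` and `mlp` are all assumptions made for illustration.

```python
import numpy as np

def gat_layer(h, src, dst, W, a, slope=0.2):
    """Single-head graph attention: node i aggregates projected features of its neighbors j."""
    z = h @ W                                              # (N, F) projected features
    score = np.concatenate([z[dst], z[src]], axis=1) @ a   # (E,) one raw score per edge j -> i
    score = np.where(score > 0, score, slope * score)      # LeakyReLU
    w = np.exp(score - score.max())                        # constant shift cancels in the softmax
    denom = np.zeros(len(h))
    np.add.at(denom, dst, w)                               # per-target normalizer
    alpha = w / denom[dst]                                 # softmax over each node's incoming edges
    out = np.zeros_like(z)
    np.add.at(out, dst, alpha[:, None] * z[src])           # attention-weighted neighbor sum
    return out, alpha

def mlp(x, layers):
    """Pointwise MLP applied to every node independently (no horizontal mixing)."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)                         # ReLU on hidden layers only
    return x

rng = np.random.default_rng(0)
n_nodes, n_in, n_hidden, n_out = 6, 5, 8, 3                # hypothetical sizes

# Toy ring graph (assumption): each node receives from its left and right neighbor.
nodes = np.arange(n_nodes)
dst = np.concatenate([nodes, nodes])
src = np.concatenate([(nodes - 1) % n_nodes, (nodes + 1) % n_nodes])

h = rng.normal(size=(n_nodes, n_in))
W1, a1 = 0.3 * rng.normal(size=(n_in, n_hidden)), 0.3 * rng.normal(size=2 * n_hidden)
W2, a2 = 0.3 * rng.normal(size=(n_hidden, n_hidden)), 0.3 * rng.normal(size=2 * n_hidden)
dims = [n_hidden, 16, 16, 16, n_out]                       # four linear layers
layers = [(0.1 * rng.normal(size=(i, o)), np.zeros(o)) for i, o in zip(dims[:-1], dims[1:])]

h1, _ = gat_layer(h, src, dst, W1, a1)                     # "advection": mix along graph edges
h2, alpha = gat_layer(np.maximum(h1, 0.0), src, dst, W2, a2)
y = mlp(h2, layers)                                        # "convection": per-node transform
print(y.shape)
```

Keeping the two stages separate means only the attention layers move information between grid points, while the MLP's weights are shared across all nodes, which is one plausible reason the parameter count can stay small.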

Paper Structure

This paper contains 31 sections, 9 equations, 9 figures, and 7 tables.

Figures (9)

  • Figure 1: The combination of dynamics and physics forms the comprehensive atmosphere model. This study proposes a modularized weather forecasting model combining Graph Attention Network and Multi-Layer Perceptron (GAT-MLP) to simulate the advection and convection in the atmospheric model respectively, in an iterative forecasting manner.
  • Figure 2: A general view of our proposed model. The inputs are the variables from the previous three timesteps together with the constants. Each grid cell is treated as a node and connected to its four neighbors. The whole graph is then passed through a 2-layer Graph Attention Network (GAT), followed by a 4-layer Multi-Layer Perceptron (MLP), to predict the variables at the next timestep. The model predicts auto-regressively: the predicted variables are fed back into the model for iterative forecasting.
  • Figure 3: An example visualization of global forecasting results at a 30-hour lead time for Z500, T850, and T2M. For each variable, the first row shows the results predicted by our model, and the second row shows the ground truth from the WeatherBench dataset.
  • Figure 4: Ablation study results for 5-day iterative forecasting.
  • Figure 5: Validation of the iterative training strategy. W and W/O refer to with and without the iterative training strategy.
  • ...and 4 more figures
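
The graph construction and auto-regressive loop described in the Figure 2 caption can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the paper's code: the 5.625° grid is taken as 32 × 64 cells, longitude is assumed to wrap around, no edges are assumed across the poles (so the two polar rows have three neighbors instead of four), and `persistence_plus_mean` is a hypothetical placeholder for the trained GAT-MLP. The function names `build_stencil_edges` and `rollout` and the choice of three variables are likewise made up for the example.

```python
import numpy as np

def build_stencil_edges(n_lat=32, n_lon=64):
    """Directed edges (src, dst) linking every grid cell to its N/S/E/W neighbors."""
    idx = np.arange(n_lat * n_lon).reshape(n_lat, n_lon)
    east, west = np.roll(idx, -1, axis=1), np.roll(idx, 1, axis=1)  # periodic in longitude (assumed)
    src = np.concatenate([east.ravel(), west.ravel(), idx[1:].ravel(), idx[:-1].ravel()])
    dst = np.concatenate([idx.ravel(), idx.ravel(), idx[:-1].ravel(), idx[1:].ravel()])
    return src, dst

def rollout(step_fn, history, n_steps):
    """Autoregressive forecast: feed each prediction back in as the newest input state."""
    window = list(history)                 # the most recent states, oldest first
    preds = []
    for _ in range(n_steps):
        nxt = step_fn(np.stack(window))    # (3, N, C) -> (N, C)
        preds.append(nxt)
        window = window[1:] + [nxt]        # drop the oldest state, append the prediction
    return np.stack(preds)

src, dst = build_stencil_edges()
n_nodes, n_vars = 32 * 64, 3               # 5.625 deg grid; 3 variables is a placeholder

def persistence_plus_mean(states):
    """Placeholder for the trained network: blends the last state with its neighbor mean."""
    last = states[-1]
    neigh = np.zeros_like(last)
    np.add.at(neigh, dst, last[src])       # sum neighbor values into each target cell
    deg = np.bincount(dst, minlength=n_nodes)[:, None]
    return 0.5 * last + 0.5 * neigh / deg

# Three previous timesteps as input, as in the Figure 2 caption (random data here).
history = [np.random.default_rng(s).normal(size=(n_nodes, n_vars)) for s in range(3)]
forecast = rollout(persistence_plus_mean, history, n_steps=20)   # 20 x 6 h = 5 days
print(len(src), forecast.shape)
```

With a 6-hour step, 20 iterations cover the 5-day horizon; because each step consumes earlier predictions, errors can accumulate over the rollout, which is the setting the iterative forecasting figures evaluate.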