Towards an end-to-end artificial intelligence driven global weather forecasting system

Kun Chen, Lei Bai, Fenghua Ling, Peng Ye, Tao Chen, Hang Fan, Hao Chen, Yi Xiao, Kang Chen, Tao Han, Jing-Jia Luo, Wanli Ouyang

TL;DR

The paper presents Adas, an AI-driven data assimilation model that learns the steady-state background error covariance and uses a confidence matrix to weigh observations, enabling an end-to-end forecasting pipeline with FengWu (FengWu-Adas). Adas integrates with FengWu in a cyclic training regime to produce stable analyses and long-horizon forecasts using conventional observations, achieving performance competitive with IFS in real-world and idealized tests. Core innovations include patch-based latent representations, a UNet-like multi-scale architecture, and attention/convolutional mechanisms (gated convolution and gated cross-attention) guided by observation quality. The results demonstrate robust, fast inference and potential for data-driven end-to-end weather forecasting, while acknowledging current limitations such as satellite data integration and off-grid modeling for future improvement.

Abstract

Weather forecasting systems are important for science and society, and significant achievements have been made in applying artificial intelligence (AI) to medium-range weather forecasting. However, existing AI-based weather forecasting models rely on analysis or reanalysis products from traditional numerical weather prediction (NWP) systems as initial conditions for making predictions. These initial states are typically generated by traditional data assimilation components, which are computationally expensive and time-consuming. Here, by using cyclic training to model the steady-state background error covariance and introducing a confidence matrix to characterize the quality of observations, we present an AI-based data assimilation model, i.e., Adas, for global weather variables. Further, we combine Adas with an advanced AI-based forecasting model (i.e., FengWu) to construct an end-to-end AI-based global weather forecasting system: FengWu-Adas. We demonstrate that Adas can assimilate global conventional observations to produce high-quality analyses, enabling the system to operate stably over the long term. Moreover, the system can generate accurate end-to-end weather forecasts with skill comparable to that of the IFS, demonstrating the promising potential of data-driven approaches.
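
The cyclic forecast-assimilation idea in the abstract can be illustrated with a toy loop. This is a minimal sketch, not the authors' method: `toy_forecast` and `toy_assimilate` are hypothetical stand-ins for FengWu and Adas, the "confidence matrix" is reduced to a 0/1 mask marking where an observation exists, and the grid size, observation ratio, and perfect-model dynamics are all assumptions chosen for illustration.

```python
import numpy as np

def toy_forecast(state):
    # Hypothetical stand-in for the forecast model:
    # shift the field one grid column east.
    return np.roll(state, 1, axis=-1)

def toy_assimilate(background, obs, confidence):
    # Hypothetical stand-in for the assimilation model.
    # confidence lies in [0, 1]; 0 means "no observation here",
    # so the background is kept unchanged at that grid point.
    return background + confidence * (obs - background)

def run_cycles(truth0, background0, n_cycles, obs_ratio, seed=0):
    rng = np.random.default_rng(seed)
    truth, background = truth0, background0
    rmse = []
    for _ in range(n_cycles):
        # Sparse observations: a random subset of grid points each cycle.
        confidence = (rng.random(truth.shape) < obs_ratio).astype(float)
        obs = truth * confidence
        analysis = toy_assimilate(background, obs, confidence)
        rmse.append(float(np.sqrt(np.mean((analysis - truth) ** 2))))
        # The analysis becomes the initial state of the next forecast,
        # and that forecast becomes the next background.
        truth = toy_forecast(truth)
        background = toy_forecast(analysis)
    return rmse

rng = np.random.default_rng(42)
truth0 = rng.standard_normal((32, 64))
background0 = rng.standard_normal((32, 64))  # pure-noise initial background
rmse = run_cycles(truth0, background0, n_cycles=30, obs_ratio=0.1)
print(f"first analysis RMSE: {rmse[0]:.3f}, last: {rmse[-1]:.3f}")
```

In this toy setting the analysis error can only shrink from cycle to cycle, because the forecast is perfect and the observations are noise-free. The real system has neither property, which is why the paper trains Adas cyclically rather than on a single step.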

Paper Structure (5 sections, 6 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: The progression of global weather forecasting systems. Traditional NWP systems consist of a physical weather forecasting model and a data assimilation component. The breakthrough of AI-based medium-range weather forecasting models occurred in 2022-2023 with highly competitive accuracy, but these models still rely on NWP systems to make predictions. Our work is dedicated to exploring the possibility of an end-to-end global weather forecasting system driven purely by AI.
  • Figure 2: The framework of FengWu-Adas and network architecture of Adas. a, The framework and process of FengWu-Adas. The prediction of FengWu provides the background for data assimilation, and the analysis of Adas serves as the initial state for weather forecasting. The parameters of FengWu are frozen during training and are neither fine-tuned nor involved in joint optimization. b, Overview of Adas's network structure. The dual encoder, composed of gated interaction blocks, extracts multi-scale features of the background and the sparse observations separately and captures the interactions between them; the features are then fused and decoded. c, Details of the dual gated interaction block. The gated convolution and gated cross-attention modules are designed for sparse observations and are both guided by the confidence matrix. d, Simple schematic diagram of patch merging and patch expanding. Patch merging applies a rearrangement operation and a linear layer to down-sample, and patch expanding applies the opposite operation to up-sample.
  • Figure 3: The performance and properties of FengWu-Adas in idealized experiments. The RMSE (lower is better) is calculated against ERA5. a, RMSE variations of the analyses for the $z500$ and $t850$ variables with different observation ratios over a whole year. The forecast with a 5-day lead time is used as the initial background to start the system. As the system continuously executes the forecast-assimilation cycle, the RMSE of the analysis decreases rapidly and converges to a lower level. b, The performance differences between cyclic training and single-step training. Cyclic training directly models the background error covariance at steady state, circumventing the error accumulation that single-step training shows in cyclic forecast-assimilation experiments and achieving lower steady-state errors. c, The performance when starting the system with completely random Gaussian noise as the initial background. The system still recovers quickly to normal steady-state levels, showing stability and the ability to assimilate observations on an arbitrary background. d, The power spectra of the 10-day lead background and analysis for the $z500$ variable. After data assimilation, Adas significantly corrects the spectral degradation of long-lead-time FengWu forecasts at small and medium scales. e, The sensitivity of the system to noise of varying magnitude. The horizontal axes correspond to random Gaussian noise proportions of 0, 0.1%, 0.5%, 1%, 2.5%, 5%, and 10%. The results demonstrate the robustness of Adas in the presence of noise.
  • Figure 4: The performance of FengWu-Adas in producing analyses with real GDAS observations. a, RMSE variations of the analyses for the $z500$ and $t850$ variables evaluated against ERA5, and the comparison with the 3DVar algorithm. By alternating between forecasting and data assimilation, the system also maintains long-term stability in real-world scenarios. b, RMSE variations of the analyses evaluated on 200 reserved GDAS observation columns, and the comparison with IFS-Analysis and ERA5. The analysis quality of Adas evaluated at the stations is comparable to that of IFS-Analysis and ERA5, although Adas only assimilates conventional observations. c, Box plot of the RMSE across the evaluation stations at all times. Whether considering the median, the IQR or the outliers, Adas shows reasonable results and is not significantly worse than IFS-Analysis and ERA5. d, Scatter plot and corresponding fits between the observation and analysis values across the evaluation stations at all times.
  • Figure 5: Visualization of the fields during data assimilation and the RMSE distribution evaluated on the reserved stations. a, Visualization of the $z500$ variable and the corresponding error distributions relative to ERA5 during data assimilation. The date-time, 2017-01-27 00:00 UTC, is randomly selected, and the background is the 96-hour forecast based on ERA5. b, Visualization of the $t850$ variable and its error distributions during data assimilation. c, Visualization of the RMSE distribution at the reserved stations and the comparison with IFS-Analysis and ERA5. The date-time is again 2017-01-27 00:00 UTC, and the analysis of Adas shows errors similar to those of IFS-Analysis and ERA5.
  • ...and 3 more figures
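
The Figure 2d caption describes patch merging as a rearrangement plus a linear layer for down-sampling, with patch expanding as the opposite operation. A minimal NumPy sketch of that idea follows. The function names, the factor `p = 2`, the channel sizes, and the ordering of the linear layer relative to the rearrangement in `patch_expand` are assumptions for illustration, not details confirmed by the source.

```python
import numpy as np

def patch_merge(x, weight, p=2):
    # x: (H, W, C) -> (H/p, W/p, C_out).
    # Rearrange each p x p neighbourhood into the channel axis,
    # then apply a linear layer with weight of shape (p*p*C, C_out).
    H, W, C = x.shape
    x = x.reshape(H // p, p, W // p, p, C)
    x = x.transpose(0, 2, 1, 3, 4).reshape(H // p, W // p, p * p * C)
    return x @ weight

def patch_expand(x, weight, p=2):
    # x: (h, w, C) -> (h*p, w*p, C_out).
    # Apply a linear layer with weight of shape (C, p*p*C_out),
    # then rearrange channels back into p x p spatial neighbourhoods.
    h, w, _ = x.shape
    x = x @ weight
    c_out = x.shape[-1] // (p * p)
    x = x.reshape(h, w, p, p, c_out).transpose(0, 2, 1, 3, 4)
    return x.reshape(h * p, w * p, c_out)

rng = np.random.default_rng(0)
field = rng.standard_normal((8, 16, 3))       # toy (H, W, C) feature map
w_down = rng.standard_normal((2 * 2 * 3, 6))  # hypothetical learned weights
w_up = rng.standard_normal((6, 2 * 2 * 3))

coarse = patch_merge(field, w_down)           # shape (4, 8, 6)
fine = patch_expand(coarse, w_up)             # shape (8, 16, 3)
print(coarse.shape, fine.shape)
```

With identity matrices in place of the learned weights, the two rearrangements invert each other exactly, which is the sense in which expanding is the opposite of merging.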