FuXiWeather2: Learning accurate atmospheric state estimation for operational global weather forecasting

Xiaoze Xu, Xiuyu Sun, Songling Zhu, Xiaohui Zhong, Yuanqing Huang, Zijian Zhu, Jun Liu, Hao Li

Abstract

Numerical weather prediction has long been constrained by the computational bottlenecks inherent in data assimilation and numerical modeling. While machine learning has accelerated forecasting, existing models largely serve as "emulators of reanalysis products," thereby retaining their systematic biases and operational latencies. Here, we present FuXiWeather2, a unified end-to-end neural framework for assimilation and forecasting. We align training objectives directly with a combination of real-world observations and reanalysis data, enabling the framework to effectively rectify inherent errors within reanalysis products. To address the distribution shift between NWP-derived background inputs during training and self-generated backgrounds during deployment, we introduce a recursive unrolling training method to enhance the precision and stability of analysis generation. Furthermore, our model is trained on a hybrid dataset of raw and simulated observations to mitigate the impact of observational distribution inconsistency. FuXiWeather2 generates high-resolution ($0.25^{\circ}$) global analysis fields and 10-day forecasts within minutes. The analysis fields surpass the NCEP-GFS across most variables and demonstrate superior accuracy over both ERA5 and the ECMWF-HRES system in lower-tropospheric and surface variables. These high-quality analysis fields drive deterministic forecasts that exceed the skill of the HRES system in 91\% of evaluated metrics. Additionally, its outstanding performance in typhoon track prediction underscores its practical value for rapid response to extreme weather events. The FuXiWeather2 analysis dataset is available at https://doi.org/10.5281/zenodo.18872728.
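
The recursive unrolling idea in the abstract can be illustrated with a toy forward pass. This is a minimal sketch, not the authors' implementation: the scalar state, the `assimilate` and `forecast` functions, and the `gain` and `decay` parameters are all hypothetical stand-ins for the neural modules. It only shows the data flow, in which each analysis is produced from a background that the system generated itself in the previous cycle, and a loss is accumulated across cycles. Four cycles are used here to match the $\times 4$ unrolling described in Figure 4; no gradients are computed.

```python
# Toy illustration of an unrolled assimilation-forecast cycle.
# All functions and parameters are hypothetical stand-ins, not the paper's model.

def assimilate(background, observation, gain=0.5):
    """Toy DA step: nudge the background toward the observation."""
    return background + gain * (observation - background)


def forecast(analysis, decay=0.9):
    """Toy forecast step: advance the analysis to the next cycle's background."""
    return decay * analysis


def unrolled_loss(first_background, observations, targets):
    """Run one DA/forecast cycle per observation and sum squared analysis errors."""
    background = first_background
    total = 0.0
    analyses = []
    for obs, target in zip(observations, targets):
        analysis = assimilate(background, obs)
        analyses.append(analysis)
        total += (analysis - target) ** 2
        # The next background is self-generated, as it would be in deployment.
        background = forecast(analysis)
    return total, analyses


# Four cycles with a constant observation and target of 2.0.
loss, analyses = unrolled_loss(1.0, [2.0] * 4, [2.0] * 4)
```

In an actual training setup the loss would be backpropagated through all cycles, so the assimilation module learns from backgrounds it will see at inference time rather than only from NWP-derived ones.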

Paper Structure (25 sections, 23 equations, 26 figures, 2 tables)

This paper contains 25 sections, 23 equations, 26 figures, 2 tables.

Figures (26)

  • Figure 1: Overview of the FuXiWeather2 system. (A) Input observations. The system ingests multi-source data, including remote sensing and in-situ measurements. A hybrid dataset, combining real-world observations and physical simulations, is utilized during training, while only real-world observations are used during inference. (B) Model architectures and cycling mode. The system consists of a multi-branch U-Net for data assimilation (DA) and a Swin U-Transformer for forecasting. The DA module integrates observations with a short-range background forecast to generate the analysis, which then drives the forecast module to produce the next-step background. Both modules operate in an interleaved cycling manner, continuously generating stable analysis fields. The entire process is optimized via a recursive unrolled end-to-end training paradigm. (C) Supervision signals. During training, FuXiWeather2 employs dual supervision, leveraging both global reanalysis products and discrete in-situ observations. (D) Medium-range autoregressive forecasting. Initialized by the analysis, the forecast module generates 10-day global forecasts through an autoregressive approach.
  • Figure 2: Temperature (A) and humidity (B) Jacobian functions for the selected 38 IASI channels. The selection consists of 21 channels in the $\mathrm{CO_2}$ absorption band (650-770 $\mathrm{cm}^{-1}$, red lines), 16 channels in the water vapor absorption band (1210-2020 $\mathrm{cm}^{-1}$, green lines), and 1 atmospheric window channel (943.25 $\mathrm{cm}^{-1}$, blue line).
  • Figure 3: Loss weights for observational supervision. (A–E) Loss weights for upper-air variables across 13 pressure levels. (F) Loss weights for surface variables. Surface variables are assigned a uniform weight of 0.2. For upper-air variables, considering the land–atmosphere interaction, the weights decay linearly with height and remain constant at 0.02 above the 500 hPa level. The maximum weights for relative humidity ($R$), U-wind component ($U$), V-wind component ($V$), and geopotential ($Z$) are set to 0.1, whereas temperature ($T$) is assigned a higher maximum weight of 0.2. This higher weighting for $T$ accounts for its superior spatial continuity, which enhances the spatial representativeness
  • Figure 4: Schematic of the pipeline parallelism strategy for recursive unrolled training. (A) Partitioning of the model across GPUs for assimilation cycles. (B) Data flow across modules and GPUs during forward propagation. FuXiWeather2 distributes the tightly coupled assimilation and forecasting modules across four GPUs. GPU1 and GPU2 serve as assimilation devices, while GPU3 and GPU4 act as forecasting devices. Each module is further divided into two sequential blocks. The $\times 4$ symbol denotes the continuous execution of four assimilation cycles, where the forecast output cyclically feeds back into the assimilation module.
  • Figure 5: Observation-based verification for global analysis. (A–J) Time-averaged normalized root mean square error (RMSE) differences (A–E) and RMSE time series (F–J) of 12-hour forecasts using in-situ observations as ground truth. (K, L) Time-averaged normalized RMSE differences (K) and RMSE time series (L) of 12-hour forecasts using COSMIC-2 refractivity data as ground truth. (M, N) Time-averaged normalized RMSE differences (M) and RMSE time series (N) of analysis fields using YunYao refractivity data as ground truth. Gray, blue, green, and red lines represent GFS, HRES, ERA5, and FuXiWeather2, respectively. The evaluation spans a one-year testing period at 00:00 and 12:00 UTC. In panels L and N, dash-dotted, dashed, and solid lines denote the 850, 500, and 300 hPa pressure levels, respectively. In panels A–E, K, and M, 13 pressure levels are displayed, with shaded areas representing three times the 95% confidence intervals of the t-test. To ensure independence, evaluation is conducted on 12-hour forecast fields (A–L) for observation types that are already assimilated (radiosonde, land station, marine platform, and COSMIC-2 GNSS-RO). For YunYao GNSS-RO data, which remain unassimilated by all products, the evaluation is performed directly on the (re)analysis fields (M and N).
  • ...and 21 more figures
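
The weighting scheme described in the Figure 3 caption can be sketched as a small function. This is a hedged illustration under several assumptions that the caption does not confirm: the 13 pressure levels listed below are a commonly used set rather than ones stated in the source, the linear decay is taken to be linear in pressure, the maximum weight is assumed to apply at 1000 hPa, and the floor of 0.02 is assumed to be reached exactly at 500 hPa. Only the maximum weights, the floor value, and the surface weight come from the caption.

```python
# Sketch of the observation-loss weight profile from the Figure 3 caption.
# ASSUMPTIONS (not stated in the source): the level list, decay linear in
# pressure, maximum weight at 1000 hPa, floor reached at 500 hPa.

LEVELS_HPA = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]

FLOOR_WEIGHT = 0.02    # constant weight above the 500 hPa level (from the caption)
SURFACE_WEIGHT = 0.2   # uniform weight for surface variables (from the caption)
MAX_WEIGHT = {"T": 0.2, "R": 0.1, "U": 0.1, "V": 0.1, "Z": 0.1}  # from the caption


def upper_air_weight(var, p_hpa, p_bottom=1000.0, p_top=500.0):
    """Return the loss weight for variable `var` at pressure `p_hpa`."""
    if p_hpa <= p_top:
        # At or above 500 hPa the weight stays at the floor value.
        return FLOOR_WEIGHT
    # Fraction of the way from 500 hPa (0.0) down to 1000 hPa (1.0).
    frac = (p_hpa - p_top) / (p_bottom - p_top)
    return FLOOR_WEIGHT + frac * (MAX_WEIGHT[var] - FLOOR_WEIGHT)


# One weight per level for each upper-air variable.
weights = {
    var: [upper_air_weight(var, p) for p in LEVELS_HPA] for var in MAX_WEIGHT
}
```

Under these assumptions, temperature receives about twice the near-surface weight of the other upper-air variables, and all variables share the same weight from 500 hPa upward.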