Benchmarking AI-based data assimilation to advance data-driven global weather forecasting
Wuxin Wang, Weicheng Ni, Ben Fei, Tao Han, Lilan Huang, Taikang Yuan, Xiaoyong Li, Lei Bai, Boheng Duan, Kaijun Ren
TL;DR
DABench addresses the lack of real-world, objective benchmarks for AI-based data assimilation (DA) in global weather forecasting by combining ERA5 reanalysis and GDAS prepbufr observations into a standardized, open benchmarking platform. It evaluates both deterministic and ensemble DA configurations and uses Pangu-Weather to assess the impact of AI-generated analyses on medium-range forecasts, with dual validation against ERA5 and independent radiosondes. Across a one-year DA cycle and a 10-day forecast horizon, AI-based DA methods, especially 4DVarFormer, prove robust and perform competitively with state-of-the-art AI-driven 4DVar frameworks, highlighting the potential for autonomous, data-driven global forecasting. The study also identifies limitations, such as the lack of satellite radiances and resolution constraints, and outlines future directions (physics-informed AI, self-supervised learning, and hybrid AI-physics approaches) for moving closer to operational capabilities.
Abstract
Research on Artificial Intelligence (AI)-based Data Assimilation (DA) is expanding rapidly. However, the absence of an objective, comprehensive, and real-world benchmark hinders the fair comparison of diverse methods. Here, we introduce DABench, a benchmark designed to support the development and evaluation of AI-based DA methods. By integrating real-world observations, DABench provides an objective and fair platform for validating long-term closed-loop DA cycles, supporting both deterministic and ensemble configurations. Furthermore, we assess the efficacy of AI-based DA in generating initial conditions for an advanced AI-based weather forecasting model to produce accurate medium-range global weather forecasts. Our dual validation, using both reanalysis data and independent radiosonde observations, demonstrates that AI-based DA achieves performance competitive with state-of-the-art AI-driven four-dimensional variational frameworks on both global weather DA and medium-range forecasting metrics. We invite the research community to use DABench to accelerate the advancement of AI-based DA for global weather forecasting.
