DiffDA: a Diffusion Model for Weather-scale Data Assimilation

Langwen Huang; Lukas Gianinazzi; Yuejiang Yu; Peter D. Dueben; Torsten Hoefler

TL;DR

DiffDA is a diffusion-based data assimilation framework for weather-scale, 0.25° global fields. It adapts a pretrained GraphCast backbone and conditions on predicted states and on sparse observations, which are enforced at inference time. A soft-masked, interpolation-informed conditioning strategy handles irregularly placed observations. Using sparse observations (less than 0.96% of the gridded data) together with a 48-hour forecast, DiffDA produces initial conditions whose forecasts lose at most 24 hours of lead time compared with ERA5 initial conditions. Experiments cover single-step assimilation, autoregressive assimilation, and forecasts from assimilated data, and they indicate that the method can support autoregressive reanalysis. The work offers a practical, scalable path toward high-resolution ML-based reanalysis and forecast systems with reduced computational demands.

Abstract

The generation of initial conditions via accurate data assimilation is crucial for weather forecasting and climate modeling. We propose DiffDA, a denoising diffusion model capable of assimilating atmospheric variables using predicted states and sparse observations. Acknowledging the similarity between a weather forecast model and a denoising diffusion model dedicated to weather applications, we adapt the pretrained GraphCast neural network as the backbone of the diffusion model. Experiments based on simulated observations from the ERA5 reanalysis dataset show that our method can produce assimilated global atmospheric data consistent with observations at 0.25° (~30 km) resolution. This marks the highest resolution achieved by ML data assimilation models. The experiments also show that initial conditions assimilated from sparse observations (less than 0.96% of gridded data) and a 48-hour forecast can be used for forecast models with a loss of lead time of at most 24 hours compared to initial conditions from state-of-the-art data assimilation in ERA5. This makes the method suitable for real-world applications, such as creating reanalysis datasets with autoregressive data assimilation.
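To give a sense of scale for the figures quoted in the abstract, the short sketch below works out the size of a global 0.25° latitude-longitude grid and what 0.96% of it amounts to. It assumes the standard 721 x 1440 layout of a 0.25° grid that includes both poles, and it assumes the percentage is counted over horizontal grid points; the abstract does not state how the fraction is counted, so the resulting number is illustrative only.

```python
# Back-of-the-envelope sizes for a global 0.25 degree grid.
# Assumption: 721 latitudes (90S..90N inclusive) x 1440 longitudes.
N_LAT = int(180 / 0.25) + 1   # 721
N_LON = int(360 / 0.25)       # 1440
grid_points = N_LAT * N_LON

# Assumption: the "less than 0.96%" bound is counted over horizontal grid points.
SPARSE_FRACTION = 0.0096
max_observed_points = SPARSE_FRACTION * grid_points

# Zonal grid spacing at the equator, using an equatorial circumference of 40,075 km.
EQUATOR_KM = 40075.0
spacing_km = EQUATOR_KM / N_LON

print(f"grid points per level: {grid_points}")
print(f"0.96% of the grid:     {max_observed_points:.0f} points")
print(f"spacing at equator:    {spacing_km:.1f} km")
```

The equatorial spacing comes out just under 28 km, which is consistent with the "~30 km" quoted in the abstract.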

Paper Structure (24 sections, 5 equations, 23 figures, 2 tables, 1 algorithm)

Introduction
Method
Problem Formulation
Denoising Diffusion Probabilistic Model
Conditioning for Predicted State
Conditioning for Sparse Observations
Selection of Diffusion Model Structure
Experiments
Implementation
Training Data
Treatment of Conditioning for Sparse Observations
Experiment Settings and Results
Single-step Data Assimilation
Autoregressive Data Assimilation
Forecast on Single-step Assimilated Data
...and 9 more sections

Figures (23)

Figure 1: Diagram of a numerical weather forecasting pipeline. It consists of data assimilation, simulation and post-processing. Data assimilation produces gridded values from sparse observations and predicted gridded values from previous time steps. Simulation takes in gridded values and produces predictions in gridded values at future time steps. Post-processing improves prediction so that it is closer to future observations.
Figure 2: Architecture of the diffusion-based data assimilation method. We take advantage of the input and output shape of the pretrained GraphCast model, which takes the state of the atmosphere at two time steps as input. In each iteration of the denoising diffusion process, the adapted GraphCast model takes the predicted state and the noisy assimilated state, and further denoises the assimilated state. To enforce the observation values at inference time, the denoised state is merged with interpolated observations using a soft mask created by softbleeding the hard mask derived from the original observations.
Figure 3: Creating a soft mask from a hard mask using softbleed. Softbleed performs a $(\max, \times)$-convolution of the hard mask with a Gaussian kernel.
Figure 4: Overview of the experiment settings. Single-step data assimilation takes in observations and a 48h forecast, and outputs assimilated data at 0h. Autoregressive data assimilation combines a data assimilation model and a 6h prediction model to produce assimilated data every 6h autoregressively. It is also of interest to perform 48h forecasts on single-step assimilated data. Hexagons represent atmosphere states, black arrows represent data assimilation, brown solid arrows represent 6-hour prediction, brown dashed arrows represent 48-hour prediction, hexagons with dashed edges and sparse points represent sparse observations, and wide arrows point out targets and references to compare in each experiment.
Figure 5: Root mean square errors (RMSEs, shown by the numbers in the cells) of geopotential at 500 hPa, temperature at 850 hPa, and temperature at 2 m from the single-step assimilated data, and from 6-hour to 48-hour GraphCast forecasts. The errors are calculated against the ERA5 data. The cells are color-coded with the RMSEs relative to the 48-hour forecast errors.
...and 18 more figures
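
The captions of Figures 2 and 3 describe two steps that are easier to follow in code: softbleed, a $(\max, \times)$-convolution of the hard observation mask with a Gaussian kernel, and the merge of the denoised state with interpolated observations through the resulting soft mask. The following is a minimal pure-Python sketch, not the authors' implementation. The function names, the kernel radius and width, the zero padding at the borders (a global grid would wrap in longitude), and the linear blend used in `merge` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(radius, sigma):
    """Unnormalised 2D Gaussian kernel whose centre weight is exactly 1."""
    return [[math.exp(-(di * di + dj * dj) / (2.0 * sigma * sigma))
             for dj in range(-radius, radius + 1)]
            for di in range(-radius, radius + 1)]


def softbleed(hard_mask, radius=2, sigma=1.0):
    """(max, x)-convolution: like an ordinary convolution, but the sum over
    the kernel window is replaced by a max. Borders are zero padded
    (assumption; no longitude wrap-around)."""
    kernel = gaussian_kernel(radius, sigma)
    rows, cols = len(hard_mask), len(hard_mask[0])
    soft = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            best = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:
                        weight = kernel[di + radius][dj + radius]
                        best = max(best, weight * hard_mask[ii][jj])
            soft[i][j] = best
    return soft


def merge(denoised, interpolated_obs, soft_mask):
    """Blend per grid point: the mask weights the interpolated observations,
    its complement weights the denoised state (assumed linear blend)."""
    return [[m * o + (1.0 - m) * d
             for d, o, m in zip(d_row, o_row, m_row)]
            for d_row, o_row, m_row in zip(denoised, interpolated_obs, soft_mask)]


# Toy example: one observed point in the middle of a 5 x 5 grid.
hard = [[0] * 5 for _ in range(5)]
hard[2][2] = 1
soft = softbleed(hard, radius=2, sigma=1.0)

denoised = [[280.0] * 5 for _ in range(5)]       # hypothetical model output
interpolated = [[285.0] * 5 for _ in range(5)]   # hypothetical interpolated observations
merged = merge(denoised, interpolated, soft)
```

Because the kernel's centre weight is 1, every observed point keeps a mask value of 1, so the merged field reproduces the observation there, while the mask decays smoothly with distance and the denoised state takes over away from observations.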
