Appa: Bending Weather Dynamics with Latent Diffusion Models for Global Data Assimilation
Gérôme Andry, Sacha Lewin, François Rozet, Omer Rochman, Victor Mangeleer, Matthias Pirlet, Elise Faulx, Marilaure Grégoire, Gilles Louppe
TL;DR
This work addresses global data assimilation: inferring the posterior distribution over atmospheric trajectories p(x^{1:L} | y) given observations y. Appa is a latent diffusion framework that compresses each atmospheric state x^i into a latent z^i with a learned autoencoder, z^i ∼ N(Eψ(x^i), σ_z^2 I), and models trajectories in latent space with a diffusion transformer. Conditioning on observations relies on the Bayes decomposition of the posterior score, ∇_{z^{1:L}_t} log p(z^{1:L}_t | y) = ∇_{z^{1:L}_t} log p(z^{1:L}_t) + ∇_{z^{1:L}_t} log p(y | z^{1:L}_t), where the likelihood is approximated through the decoder Dψ and the measurement operators M^i: p(y^{1:L} | z^{1:L}) ≈ N(y^{1:L} | A(z^{1:L}), Σ_y) with A(z^{1:L}) = (M^1(Dψ(z^1)) … M^L(Dψ(z^L)))^T. Empirical results on ERA5 (1993–2021) show RMSEs below 0.1 after standardization, preserved energy spectra, and physically consistent relationships such as altitude estimation and geostrophic balance. Short-lead forecasts reach skill comparable to IFS and better than GraphDOP, demonstrating the utility of a unified, probabilistic latent DA framework.
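The score decomposition above can be illustrated with a minimal NumPy sketch. It is a toy under stated assumptions, not Appa's implementation: the decoder is a hypothetical linear map W standing in for Dψ, each measurement operator M^i simply selects a few state indices, Σ_y = σ_y^2 I, the prior score is that of a standard normal rather than a trained diffusion transformer, and the diffusion noise level t is ignored (the likelihood is evaluated at clean latents). All function names are illustrative.

```python
import numpy as np

def make_toy_problem(L=3, latent_dim=4, state_dim=6, seed=0):
    """Build a hypothetical linear decoder W and one index mask per time step."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(state_dim, latent_dim))  # assumption: D(z) = W z
    # Assumption: M^i observes two randomly chosen state variables at step i.
    masks = [rng.choice(state_dim, size=2, replace=False) for _ in range(L)]
    return W, masks

def observe(z, W, masks):
    """A(z^{1:L}): decode each latent, apply M^i, and stack the results."""
    return np.concatenate([(W @ z_i)[m] for z_i, m in zip(z, masks)])

def log_likelihood(z, y, W, masks, sigma_y):
    """log N(y | A(z), sigma_y^2 I), up to an additive constant."""
    r = y - observe(z, W, masks)
    return -0.5 * np.sum(r ** 2) / sigma_y ** 2

def likelihood_score(z, y, W, masks, sigma_y):
    """Gradient of the log-likelihood w.r.t. z (analytic because A is linear here)."""
    grads, offset = [], 0
    for z_i, m in zip(z, masks):
        k = len(m)
        residual = y[offset:offset + k] - (W @ z_i)[m]
        grads.append(W[m].T @ residual / sigma_y ** 2)
        offset += k
    return np.stack(grads)

def prior_score(z):
    """Stand-in prior score: standard normal, so the score is -z."""
    return -z

def posterior_score(z, y, W, masks, sigma_y):
    """Bayes decomposition: posterior score = prior score + likelihood score."""
    return prior_score(z) + likelihood_score(z, y, W, masks, sigma_y)
```

With a nonlinear decoder, the analytic gradient would typically be replaced by automatic differentiation through the decoder and the measurement operators; the additive structure of the posterior score is unchanged.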
Abstract
Deep learning has advanced weather forecasting, but accurate predictions first require identifying the current state of the atmosphere from observational data. In this work, we introduce Appa, a score-based data assimilation model generating global atmospheric trajectories at 0.25° resolution and 1-hour intervals. Powered by a 565M-parameter latent diffusion model trained on ERA5, Appa can be conditioned on arbitrary observations to infer plausible trajectories without retraining. Our probabilistic framework handles reanalysis, filtering, and forecasting within a single model, producing physically consistent reconstructions from various inputs. Results establish latent score-based data assimilation as a promising foundation for future global atmospheric modeling systems.
