Training-Free Data Assimilation with GenCast
Thomas Savary, François Rozet, Gilles Louppe
TL;DR
This work addresses Bayesian state estimation for dynamical systems without additional training by combining pre-trained diffusion models with particle filters. Using GenCast as the diffusion prior, it develops a training-free data assimilation workflow that samples from the optimal proposal through a posterior-score decomposition, estimates the mean dynamics with a denoiser, and computes particle weights via a Dirac approximation. The workflow is implemented as a Fully-Adapted Auxiliary Particle Filter (FA-APF) with inflation. Empirical results in a GenCast-based global weather setting with 256 particles show that the FA-APF yields stable skill for both observed and unobserved variables, outperforming unconditional GenCast forecasts while maintaining nonzero ensemble spread. The approach is lightweight and broadly applicable to autoregressive diffusion models, enabling operational data assimilation without retraining and offering a natural path toward diffusion-based reanalysis.
Abstract
Data assimilation is widely used in disciplines such as meteorology, oceanography, and robotics to estimate the state of a dynamical system from noisy observations. In this work, we propose a lightweight and general method to perform data assimilation using diffusion models pre-trained to emulate dynamical systems. Our method builds on particle filters, a class of data assimilation algorithms, and does not require any further training. As a guiding example throughout this work, we illustrate our methodology on GenCast, a diffusion-based model that generates global ensemble weather forecasts.
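
To make the particle-filter ingredient concrete, the following is a minimal sketch of one fully-adapted auxiliary particle filter step on a toy 1-D linear-Gaussian model, where both the predictive likelihood and the optimal proposal are available in closed form. It is not the GenCast workflow: the transition `a * x` stands in for the emulator's mean dynamics, and the names `fa_apf_step`, `run_toy_filter`, `a`, `q`, `r`, and `inflation` are hypothetical choices made for illustration. Inflation is implemented here as a simple rescaling of deviations from the ensemble mean, which is an assumption and may differ from the scheme used with GenCast.

```python
import numpy as np


def fa_apf_step(particles, y, a, q, r, rng, inflation=1.0):
    """One fully-adapted APF step for x_t = a*x_{t-1} + N(0, q), y_t = x_t + N(0, r)."""
    n = particles.size
    # Mean dynamics per particle (toy stand-in for an emulator's mean forecast).
    pred_mean = a * particles
    # First-stage weights: predictive likelihood p(y_t | x_{t-1}) = N(y_t; a*x_{t-1}, q + r).
    log_w = -0.5 * (y - pred_mean) ** 2 / (q + r)
    log_w -= log_w.max()  # stabilise the exponentiation
    w = np.exp(log_w)
    w /= w.sum()
    # Resample ancestors according to the first-stage weights.
    ancestors = rng.choice(n, size=n, p=w)
    anc_mean = pred_mean[ancestors]
    # Optimal proposal p(x_t | x_{t-1}, y_t), Gaussian in this toy model.
    post_var = q * r / (q + r)
    post_mean = post_var * (anc_mean / q + y / r)
    new_particles = post_mean + np.sqrt(post_var) * rng.standard_normal(n)
    # Assumed multiplicative inflation around the ensemble mean (1.0 = no inflation).
    center = new_particles.mean()
    return center + inflation * (new_particles - center)


def run_toy_filter(n_steps=50, n_particles=256, a=0.9, q=0.1, r=0.05, seed=0):
    """Simulate the toy model and track it with the filter above."""
    rng = np.random.default_rng(seed)
    x = 1.0
    particles = rng.standard_normal(n_particles)
    truth, means = [], []
    for _ in range(n_steps):
        x = a * x + np.sqrt(q) * rng.standard_normal()  # hidden state
        y = x + np.sqrt(r) * rng.standard_normal()      # noisy observation
        particles = fa_apf_step(particles, y, a, q, r, rng)
        truth.append(x)
        means.append(particles.mean())
    return np.array(truth), np.array(means), particles
```

Because the particles are resampled before being propagated through the optimal proposal, they carry uniform weights after each step, so no second-stage weight update appears in the sketch. Calling `run_toy_filter()` returns the simulated truth, the filtered ensemble means, and the final ensemble of 256 particles.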
