SEEDS: Emulation of Weather Forecast Ensembles with Diffusion Models
Lizao Li, Rob Carver, Ignacio Lopez-Gomez, Fei Sha, John Anderson
TL;DR
Uncertainty quantification in numerical weather prediction is limited by the computational cost of running large physics-based ensembles. SEEDS uses diffusion models to emulate forecast ensembles conditioned on a small number of seed forecasts. It supports two modes: generative ensemble emulation and generative post-processing. The generated ensembles can match or exceed physics-based ensembles in predictive skill and tail coverage at a fraction of the computational cost. The approach trains a high-capacity axial-attention Transformer score model on two decades of GEFS reforecasts and ERA5 data, and produces $N$ samples conditioned on $K$ seeds with throughput sufficient for ensembles of tens of thousands of members. This scalable diffusion-based sampler offers a practical path to massive ensemble generation and bias correction, with potential extensions to climate projections and climate risk assessment.
Abstract
Uncertainty quantification is crucial to decision-making. A prominent example is probabilistic forecasting in numerical weather prediction. The dominant approach to representing uncertainty in weather forecasting is to generate an ensemble of forecasts. This is done by running many physics-based simulations under different conditions, which is a computationally costly process. We propose to amortize the computational cost by emulating these forecasts with deep generative diffusion models learned from historical data. The learned models are highly scalable with respect to high-performance computing accelerators and can sample hundreds to tens of thousands of realistic weather forecasts at low cost. When designed to emulate operational ensemble forecasts, the generated ones are similar to physics-based ensembles in important statistical properties and predictive skill. When designed to correct biases present in the operational forecasting system, the generated ensembles show improved probabilistic forecast metrics. They are more reliable and forecast probabilities of extreme weather events more accurately. While this work demonstrates the utility of the methodology by focusing on weather forecasting, the generative artificial intelligence methodology can be extended for uncertainty quantification in climate modeling, where we believe the generation of very large ensembles of climate projections will play an increasingly important role in climate risk assessment.
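To illustrate why ensemble size matters for extreme events, the sketch below estimates an event probability as the fraction of ensemble members exceeding a threshold. It is a toy illustration and not the SEEDS method or its evaluation code: the function names, the ensemble sizes (10 and 10,000), the threshold, and the standard-normal "anomalies" standing in for forecast values are all assumptions made for this example.

```python
import random


def exceedance_probability(members, threshold):
    """Estimate P(value > threshold) as the fraction of members above it."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    return sum(1 for m in members if m > threshold) / len(members)


def smallest_nonzero_probability(n_members):
    """Smallest nonzero probability an ensemble of this size can express."""
    return 1.0 / n_members


# Hypothetical data: standard-normal anomalies stand in for forecast values.
rng = random.Random(0)
small_ensemble = [rng.gauss(0.0, 1.0) for _ in range(10)]
large_ensemble = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

threshold = 3.0  # assumed "extreme event" threshold, in standard deviations

for name, ens in [("small", small_ensemble), ("large", large_ensemble)]:
    p = exceedance_probability(ens, threshold)
    floor = smallest_nonzero_probability(len(ens))
    print(f"{name}: N={len(ens)}, P(x > {threshold}) ~ {p:.4f}, resolution {floor:.4f}")
```

An ensemble of $N$ members can only express probabilities in steps of $1/N$, so a 10-member ensemble cannot distinguish an event with true probability near 0.1% from one that never occurs, whereas an ensemble of thousands of members can.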
