Latent Diffusion Model for Generating Ensembles of Climate Simulations
Johannes Meuer, Maximilian Witte, Tobias Sebastian Finn, Claudia Timmreck, Thomas Ludwig, Christopher Kadow
TL;DR
The paper tackles the high computational cost of generating large ensembles for climate uncertainty by introducing a latent diffusion framework that operates in a compressed latent space. It combines a pre-trained variational autoencoder (VAE) for dimensionality reduction with a denoising diffusion model (DDM) that learns latent residuals $z_y = z - z_c$ conditioned on a base latent $z_c = E(x_c)$, reconstructing samples as $\hat{x} = D(z_c + \hat{z}_y)$ after diffusion-based generation. Two sequence-generation strategies are explored: an autoregressive method that incrementally predicts the next latent state and a transformer-based attention mechanism that processes the full time domain, enabling long-horizon climate simulations with controlled memory usage. On the MPI Grand Ensemble, the transformer-based approach closely matches the original ensemble's mean and variability and captures major climate features such as ENSO events and volcanic-induced shifts, while autoregressive generation provides strong temporal continuity. This approach offers a scalable, memory-efficient alternative for uncertainty quantification in climate projections and can be extended to additional variables, resolutions, and models to enhance decision-relevant climate risk assessments.
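The residual formulation above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the linear `encode`/`decode` pair stands in for the pre-trained VAE, and `sample_residuals` is a hypothetical placeholder that draws Gaussian noise where the trained diffusion model would generate $\hat{z}_y$. All names, dimensions, and the noise scale are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy dimensions: a 64-value "field" compressed to an 8-value latent.
d_x, d_z = 64, 8

# Stand-in for the VAE (assumption): a random linear encoder E and its
# pseudo-inverse as decoder D, so that E(D(z)) == z for any latent z.
W = rng.standard_normal((d_z, d_x)) / np.sqrt(d_x)
W_pinv = np.linalg.pinv(W)


def encode(x):
    """E: map data of shape (..., d_x) to latents of shape (..., d_z)."""
    return x @ W.T


def decode(z):
    """D: map latents of shape (..., d_z) back to data of shape (..., d_x)."""
    return z @ W_pinv.T


def sample_residuals(z_c, n_members, scale=0.1):
    """Hypothetical placeholder for the diffusion sampler.

    A trained model would denoise toward residuals conditioned on z_c;
    here we simply draw Gaussian residuals with the same shape as z_c.
    """
    return scale * rng.standard_normal((n_members,) + z_c.shape)


# Base (conditioning) state and its latent z_c = E(x_c).
x_c = rng.standard_normal(d_x)
z_c = encode(x_c)

# Generate residuals and reconstruct x_hat = D(z_c + z_y_hat) per member.
z_y_hat = sample_residuals(z_c, n_members=16)
x_hat = decode(z_c + z_y_hat)

# Ensemble statistics are then computed in data space.
ensemble_mean = x_hat.mean(axis=0)
ensemble_spread = x_hat.std(axis=0)
```

Only the base latent and the sampled residuals need to be held in memory until decoding, which is the property the latent-space approach relies on; the ensemble statistics at the end are the quantities one would compare against a reference ensemble.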
Abstract
Obtaining accurate estimates of uncertainty in climate scenarios often requires generating large ensembles of high-resolution climate simulations, a computationally expensive and memory-intensive process. To address this challenge, we train a novel generative deep learning model on extensive sets of climate simulations. The model consists of two components: a variational autoencoder for dimensionality reduction and a denoising diffusion probabilistic model that generates multiple ensemble members. We validate our model on the Max Planck Institute Grand Ensemble and show that it achieves good agreement with the original ensemble in terms of variability. By leveraging the latent space representation, our model can rapidly generate large ensembles on the fly with minimal memory requirements, which can significantly improve the efficiency of uncertainty quantification in climate simulations.
