Technical Report: Towards Unified Diffusion Models for Multi-Model Climate Emulation at Scale

Francesco Immorlano, Elijah Tavares, Felix Draxler, Padhraic Smyth, Pierre Gentine, Stephan Mandt

TL;DR

The work tackles the computational bottleneck of forming large climate ensembles by introducing a unified conditional diffusion model that jointly emulates nine CMIP6 models across three SSP scenarios. The model conditions on model identity $m$, CO$_2$e $c_s$, day $d$, and year $y$, and generates daily global temperature maps by sampling from the conditional distribution $P(T\mid m,c_s,d,y)$, enabling scalable probabilistic sampling and cross-model comparisons. Key contributions include (i) efficient probabilistic sampling for uncertainty quantification across models and scenarios, (ii) orders-of-magnitude speedups over traditional climate simulations, and (iii) variance-reduced treatment effect estimation using paired seeds, which reach the precision of unpaired sampling with roughly 50–300$\times$ fewer samples (Figure 4). The approach generalizes to unseen futures, supports rapid policy-scenario exploration at regional scales, and offers a practical tool for impact assessment with well-calibrated full distributions, not just mean trajectories.

Abstract

Large ensembles of climate projections are essential for characterizing uncertainty in future climate and extreme weather events, yet the computational cost of numerical climate models limits ensemble sizes to a small number of realizations per model. We present a unified conditional diffusion model that dramatically reduces this computational barrier by learning shared distributional patterns across multiple Coupled Model Intercomparison Project Phase 6 (CMIP6) models and emission scenarios. Rather than training separate emulators for each model-scenario combination, our approach captures the common statistical structures underlying nine CMIP6 models, generating daily temperature maps with global coverage for historical and future periods. This unified framework enables: (i) efficient probabilistic sampling for comprehensive uncertainty quantification across models and scenarios; (ii) rapid generation of large ensembles that would be computationally intractable with traditional climate models; (iii) variance-reduced treatment effect analysis via fixed-seed generation that disentangles forced climate responses from internal variability. Evaluations on held-out models demonstrate reliable generalization to unseen future climates, enabling rapid exploration of different emission pathways.
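
The fixed-seed idea in (iii) can be illustrated with a minimal sketch. It assumes a hypothetical `toy_sampler` that stands in for a seeded conditional generator; it is not the paper's diffusion model, and the forcing values, grid shape, and noise scaling are invented for illustration. The point is only that reusing the same seed across two scenarios makes the shared noise largely cancel in the difference, so the estimated treatment effect map converges with far fewer samples.

```python
import numpy as np

def toy_sampler(forcing, seed, shape=(8, 16)):
    """Hypothetical stand-in for a seeded conditional generator.

    The same seed yields the same latent noise, so two calls that differ
    only in `forcing` share their internal-variability component.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(shape)  # latent noise ("internal variability")
    pattern = np.linspace(0.5, 1.5, shape[0])[:, None]  # toy latitude-dependent response
    # Forced response plus noise whose amplitude depends weakly on forcing
    return forcing * pattern + (1.0 + 0.05 * forcing) * noise

def ate_estimate(n, paired, high=4.0, low=2.0, base_seed=0):
    """Average treatment effect map (high minus low forcing) from n sample pairs."""
    diffs = []
    for i in range(n):
        seed_hi = base_seed + i
        # Paired: reuse the seed. Unpaired: draw the low-forcing sample independently.
        seed_lo = seed_hi if paired else base_seed + 10**6 + i
        diffs.append(toy_sampler(high, seed_hi) - toy_sampler(low, seed_lo))
    return np.mean(diffs, axis=0)

# Known forced difference of the toy model, used as the reference map
true_ate = (4.0 - 2.0) * np.linspace(0.5, 1.5, 8)[:, None] * np.ones((8, 16))

# Spatial standard deviation of the estimation error for each estimator
err_paired = np.std(ate_estimate(200, paired=True) - true_ate)
err_unpaired = np.std(ate_estimate(200, paired=False) - true_ate)
print(f"paired error: {err_paired:.4f}, unpaired error: {err_unpaired:.4f}")
```

In this toy setting the paired difference keeps only the small part of the noise that depends on the forcing, while the unpaired difference carries the full noise of both samples. The size of the gain in the real emulator depends on how strongly the shared seed correlates the two generated fields.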

Paper Structure

This paper contains 40 sections, 10 equations, 15 figures, 3 tables.

Figures (15)

  • Figure 1: (a) Daily temperature maps simulated by MPI and (b) sampled by the diffusion model under SSP3-7.0, averaged over 2070–2080; (c) difference map (diffusion model vs. MPI); (d,e) comparison of daily distributions for the grid points closest to Reykjavik and Los Angeles over 2070–2080.
  • Figure 2: (a) Global-mean daily temperatures simulated by MPI-ESM1-2-HR and generated by the unified diffusion model over 2070–2075 for the three SSPs used in this work; (b) same as (a), after subtracting the seasonal cycle.
  • Figure 3: Distributional analysis of regional temperature shifts between SSP5-8.5 and SSP2-4.5 for October 17th, 2100, using 1,000 sample pairs generated by the diffusion model emulating MPI. For each sample pair, identical random seeds were used across scenarios to isolate the climate forcing signal. (a–c) Spatial distribution of statistical moments: (a) first moment, showing the mean temperature change and revealing expected regional warming patterns with Arctic amplification and land-ocean contrasts; (b) second moment, showing the variance and quantifying the uncertainty of the temperature change, with the highest values in polar regions and South America indicating less confident predictions; (c) third cumulant, showing the asymmetry of the temperature change distributions, with near-zero skewness globally indicating symmetric distributions without bias toward extreme outcomes. (d,e) Location-specific probability distributions for Reykjavik and Los Angeles, with Reykjavik's broader distribution reflecting greater uncertainty at high latitudes (dashed lines indicate mean temperatures).
  • Figure 4: Paired vs. unpaired seeds: treatment effect convergence and efficiency. (a) Spatial standard deviation of the error relative to the reference average treatment effect (ATE) map as a function of sample size $n$. The reference is computed from 50,000 held-out paired samples. Treatment effects are computed between SSP5-8.5 and SSP2-4.5 for a single day and year. The paired estimator converges substantially faster, achieving equivalent precision with approximately 50–300$\times$ fewer samples. (b) Scatter plot visualizing estimator accuracy across all grid cells. Each point compares the estimated treatment effect to the reference at one location, with proximity to the diagonal indicating agreement. This shows the full distribution of estimation errors rather than a single summary statistic. Paired estimation with $n=190$ samples (blue) achieves comparable precision to unpaired estimation with $n=50{,}000$ samples (green), both achieving $r=1.00$. Unpaired estimation at the same $n=190$ (red) shows greater scatter ($r=0.96$), demonstrating the variance reduction from shared latent noise sampling.
  • Figure 5: (a) Temperature distribution shifts between October 17th, 2015 and October 17th, 2100 under SSP3-7.0 for the grid points closest to Reykjavik and Los Angeles. (b) Temperature distribution shifts across SSP2-4.5, SSP3-7.0, and SSP5-8.5 on October 17th, 2100 for the same locations. In both cases, the probability density functions were derived from 1,000 samples generated by the diffusion model emulating MPI, with dashed vertical lines indicating mean temperatures.
  • ...and 10 more figures
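
The per-grid-cell statistics described in the Figure 3 caption (mean, variance, and third cumulant of the temperature change across sample pairs) can be computed along the following lines. This is a minimal sketch, not the authors' code: the function name, the array layout `(n_pairs, n_lat, n_lon)`, and the synthetic inputs are assumptions made for illustration. It relies on the fact that the third cumulant equals the third central moment.

```python
import numpy as np

def paired_difference_moments(samples_a, samples_b):
    """Per-grid-cell moments of the paired differences samples_a - samples_b.

    Both inputs are assumed to have shape (n_pairs, n_lat, n_lon), with
    index i in each array generated from the same random seed.
    """
    diff = samples_a - samples_b
    mean = diff.mean(axis=0)                        # first moment
    centered = diff - mean
    variance = (centered ** 2).mean(axis=0)         # second central moment
    third_cumulant = (centered ** 3).mean(axis=0)   # equals the third central moment
    return mean, variance, third_cumulant

# Synthetic stand-in data (illustrative values, not climate model output)
rng = np.random.default_rng(0)
n_pairs, n_lat, n_lon = 1000, 4, 8
shared = rng.standard_normal((n_pairs, n_lat, n_lon))  # noise common to both scenarios
samples_low = 10.0 + shared
samples_high = 12.0 + shared + 0.5 * rng.standard_normal((n_pairs, n_lat, n_lon))

mean_map, var_map, k3_map = paired_difference_moments(samples_high, samples_low)
print(mean_map.shape, float(mean_map.mean()), float(var_map.mean()))
```

Because the synthetic differences are Gaussian, the third-cumulant map here fluctuates around zero, which is the pattern the caption describes as near-zero skewness. Dividing the third cumulant by the variance raised to the power 3/2 would give the dimensionless skewness.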