ArchesClimate: Probabilistic Decadal Ensemble Generation With Flow Matching

Graham Clyne, Guillaume Couairon, Guillaume Gastineau, Claire Monteleoni, Anastase Charantonis

TL;DR

ArchesClimate tackles the high cost of decadal climate ensembles by learning a probabilistic emulator from IPSL-CM6A-LR decadal hindcasts. It combines a deterministic mean forecaster with a flow-matching generative residual to auto-regress monthly states with lead time $\delta = 1$ month, producing long, physically consistent sequences up to 10 years. For conditioning, forcings are included via conditional layer normalization for CO2, CH4, CFC11eq, N2O and SSI, and training uses a held-out IPSL-DCPP split to enable fair comparison. The results show competitive probabilistic skill, with CRPS close to the IPSL-DCPP ensemble for several variables and clear potential to reduce computational cost while preserving key climate features and teleconnections.
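As a rough illustration of conditioning through layer normalization, the sketch below normalizes features and then applies a scale and shift predicted from a forcing vector. It is not the ArchesClimate implementation: the function name, the linear maps `w_scale` and `w_shift`, and the array shapes are assumptions made for this example.

```python
import numpy as np

def conditional_layer_norm(x, forcings, w_scale, w_shift, eps=1e-5):
    """Layer norm whose scale and shift depend on a forcing vector.

    x:        (tokens, channels) feature array
    forcings: (n_forcings,) normalized forcings, e.g. CO2, CH4, CFC11eq, N2O, SSI
    w_scale, w_shift: (n_forcings, channels) hypothetical linear maps
    """
    # Standard layer normalization over the channel axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # The forcings modulate the normalized features (identity when weights are zero).
    scale = 1.0 + forcings @ w_scale
    shift = forcings @ w_shift
    return x_hat * scale + shift

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
forcings = rng.normal(size=5)          # five forcings, as listed above
w_zero = np.zeros((5, 8))
w_shift = rng.normal(size=(5, 8))

plain = conditional_layer_norm(x, forcings, w_zero, w_zero)
shifted = conditional_layer_norm(x, forcings, w_zero, w_shift)
```

With zero weights the layer reduces to an ordinary layer norm, so the forcings only change the output through the learned maps.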

Abstract

Climate projections have uncertainties related to components of the climate system and their interactions. A typical approach to quantifying these uncertainties is to use climate models to create ensembles of repeated simulations under different initial conditions. Due to the complexity of these simulations, generating such ensembles of projections is computationally expensive. In this work, we present ArchesClimate, a deep learning-based climate model emulator that aims to reduce this cost. ArchesClimate is trained on decadal hindcasts of the IPSL-CM6A-LR climate model at a spatial resolution of approximately 2.5x1.25 degrees. We train a flow matching model following ArchesWeatherGen, which we adapt to predict near-term climate. Once trained, the model generates states at a one-month lead time and can be used to auto-regressively emulate climate model simulations of any length. We show that for up to 10 years, these generations are stable and physically consistent. We also show that for several important climate variables, ArchesClimate generates simulations that are interchangeable with the IPSL model. This work suggests that climate model emulators could significantly reduce the cost of climate model simulations.
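The auto-regressive generation described above can be sketched as follows, assuming a deterministic mean model plus a residual sampled by integrating a learned velocity field from noise. The Euler solver, the function names, and the toy stand-ins for the two networks are assumptions for illustration only; the 24 integration steps follow the Figure 3 caption, and the paper's actual solver, residual scaling, and conditioning are not specified in this summary.

```python
import numpy as np

def sample_residual(velocity, x_prev, mean_pred, rng, n_steps=24):
    """Euler-integrate a velocity field from noise (s=0) towards data (s=1)."""
    z = rng.standard_normal(x_prev.shape)
    ds = 1.0 / n_steps
    for k in range(n_steps):
        s = k * ds
        z = z + ds * velocity(z, s, x_prev, mean_pred)
    return z

def rollout(x0, mean_model, velocity, n_months, rng):
    """Generate n_months states, feeding each output back in as the next input."""
    states = [x0]
    x = x0
    for _ in range(n_months):
        mean_pred = mean_model(x)                       # deterministic mean forecast
        residual = sample_residual(velocity, x, mean_pred, rng)
        x = mean_pred + residual                        # one-month-ahead sample
        states.append(x)
    return np.stack(states)

# Toy stand-ins (hypothetical): a damped mean model and the exact velocity
# field that transports any noise sample to a constant residual of 0.1.
def toy_mean_model(x):
    return 0.9 * x

def toy_velocity(z, s, x_prev, mean_pred):
    target = np.full_like(z, 0.1)
    return (target - z) / (1.0 - s)

rng = np.random.default_rng(0)
x0 = np.ones((2, 3))
trajectory = rollout(x0, toy_mean_model, toy_velocity, n_months=120, rng=rng)
```

Running `rollout` several times with different random seeds from the same initial state would produce an ensemble; here the toy residual is deterministic, so all members would coincide.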

Paper Structure

This paper contains 23 sections, 13 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: On the right, a visualization of one state ($X_t$) from IPSL-DCPP, with surface and oceanic variables separated from atmospheric variables. Globally averaged and normalized external forcings are shown as a vector to the right.
  • Figure 2: Deterministic and generative training schemes for ArchesClimate. $f_{\theta}$ must be fully trained before $g_{\theta}$ is trained. $f_{\theta}$ learns a strong prior of the mean climate, while $g_{\theta}$ learns the residuals of the learned mean climate.
  • Figure 3: Sampling with ArchesClimate. Initial states and noise are given to $g_{\theta}$, and the noise is gradually shifted towards the data distribution over 24 inference timesteps. The combined result of $f_{\theta}$ and $g_{\theta}$ is then used as input for the following timestep $t$.
  • Figure 4: Ensemble means of the full state (top) and ensemble means for anomalies (bottom) of the Tropics (20° S – 20° N, 0° – 360° E) for the test period 1969-1979. The dotted lines are the maxima and minima for each 5-member ensemble mean, with the shaded area being +/- 1 standard deviation. Closer to the dotted line is better.
  • Figure 5: Variance (top) and CRPS (bottom) for different training schemes for the decade 1969-1979. residual-flow is described in \ref{sec:training}. full-flow trains without any deterministic model. full-deterministic uses only the deterministic model to make predictions. The dotted line represents the 5-member IPSL-DCPP baseline.
  • ...and 6 more figures
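
For readers unfamiliar with the CRPS reported in Figure 5, a minimal sketch of the standard ensemble estimator is given below: the mean absolute error of the members against the reference, minus half the mean absolute difference between members. Whether the paper uses this estimator or a variant (for example the "fair" version with a finite-ensemble correction) is not stated in this summary, so the choice here is an assumption, as is the function name.

```python
import numpy as np

def ensemble_crps(members, obs):
    """Standard ensemble CRPS estimator (lower is better).

    members: (n_members, ...) ensemble of forecasts
    obs:     (...) reference values, broadcast against each member
    """
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean absolute error of each member against the reference.
    skill = np.abs(members - obs).mean(axis=0)
    # Mean absolute difference over all member pairs (including i == j).
    spread = np.abs(members[:, None] - members[None, :]).mean(axis=(0, 1))
    return skill - 0.5 * spread

# Two members at 1 and 3 with reference 2: skill = 1, spread = 1.
crps_two = ensemble_crps([[1.0], [3.0]], [2.0])
# A single member reduces to the absolute error.
crps_one = ensemble_crps([[1.5]], [2.0])
```

The spread term rewards ensembles that are dispersed as well as accurate, which is why a deterministic forecast (a single member) is scored by its absolute error alone.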