AI-boosted rare event sampling to characterize extreme weather

Amaury Lancelin, Alex Wikner, Laurent Dubus, Clément Le Priol, Dorian S. Abbot, Freddy Bouchet, Pedram Hassanzadeh, Jonathan Weare

TL;DR

AI+RES combines rare event sampling (RES), an AI weather emulator, and a global climate model (GCM) to sample rare extreme weather events efficiently. By defining the RES score as the AI-emulator ensemble-mean forecast of the target observable $A_L(t_f)$, the method achieves unbiased return-period estimates with orders-of-magnitude speed-ups: up to $\mathcal{O}(100)$-fold for mid-latitude heatwaves in PlaSim, using only $N=400$ walkers to recover events with return periods up to $T_a=5\times10^4$ years. The framework produces physically realistic trajectories and uncertainty estimates, outperforming standard RES and enabling exploration of event dynamics and precursors. The approach is general and scalable to other extremes and GCMs, offering a practical path to robust tail statistics in climate science and beyond.

Abstract

Assessing the frequency and intensity of extreme weather events, and understanding how climate change affects them, is crucial for developing effective adaptation and mitigation strategies. However, observational datasets are too short and physics-based global climate models (GCMs) are too computationally expensive to obtain robust statistics for the rarest, yet most impactful, extreme events. AI-based emulators have shown promise for predictions at weather and even climate timescales, but they struggle on extreme events with few or no examples in their training dataset. Rare event sampling (RES) algorithms have previously demonstrated success for some extreme events, but their performance depends critically on a hard-to-identify "score function", which guides efficient sampling by a GCM. Here, we develop a novel algorithm, AI+RES, which uses ensemble forecasts of an AI weather emulator as the score function to guide highly efficient resampling of the GCM and generate robust (physics-based) extreme weather statistics and associated dynamics at 30-300x lower cost. We demonstrate AI+RES on mid-latitude heatwaves, a challenging test case requiring a score function with predictive skill many days in advance. AI+RES, which synergistically integrates AI, RES, and GCMs, offers a powerful, scalable tool for studying extreme events in climate science, as well as other disciplines in science and engineering where rare events and AI emulators are active areas of research.
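
To make the role of the score function concrete, the following is a minimal sketch of one selection step in a generic importance-resampling scheme. It is not the paper's Algorithm: the exponential weighting, the selection strength `k`, and the function name `resample_walkers` are all assumptions chosen for illustration. Here `scores` stands in for the AI-emulator ensemble-mean forecast of the target observable for each GCM walker.

```python
import numpy as np

def resample_walkers(scores, k, rng):
    """One illustrative selection step (hypothetical helper, not the paper's code).

    scores : per-walker score, e.g. an emulator forecast of the observable.
    k      : assumed selection strength; k = 0 means no preference.
    Returns the parent index of each continued walker and the importance
    correction each one must carry so that averages stay unbiased.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    # Exponential weights, shifted by the max score for numerical stability.
    raw = np.exp(k * (scores - scores.max()))
    probs = raw / raw.sum()
    # Multinomial resampling: promising walkers are cloned, others pruned.
    parents = rng.choice(n, size=n, replace=True, p=probs)
    # Each survivor is weighted by the inverse of its selection bias.
    corrections = raw.mean() / raw[parents]
    return parents, corrections

rng = np.random.default_rng(0)
scores = np.arange(100.0)  # toy scores, one per walker
parents, corrections = resample_walkers(scores, k=1.0, rng=rng)
```

Multiplying any statistic of a surviving walker by its correction undoes the selection bias in expectation, which is the sense in which probabilities estimated from the resampled ensemble can remain unbiased.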

Paper Structure

This paper contains 21 sections, 17 equations, 17 figures, 3 tables, 2 algorithms.

Figures (17)

  • Figure 1: Schematic of the AI+RES framework. We train an AI weather emulator on 100 years of PlaSim GCM data, then use it to guide a RES algorithm. (a) In our proposed method (AI+RES), we run a PlaSim ensemble simulation with $N$ parallel simulations, called walkers and denoted $X_t^i$. (b) At each resampling time $t_k$ in the algorithm (vertical dashed lines), we perform an ensemble forecast with the AI emulator for each PlaSim walker until the target time $t_f$, then duplicate the more promising ones based on these forecasts (Algorithm \ref{alg:dmc} and Eq. \ref{eq:score_function}). This allows us both to generate a large catalog of extreme events from physics-based simulations (c) and to compute unbiased probabilities for these rare events (Eq. \ref{eq:unbiased_estimator}) (d) at a fraction of the cost of Direct Numerical Simulation (DNS). An example of schematic (a) with actual data from an AI+RES experiment can be found in Extended Data Fig. \ref{fig:traj_and_hist}.
  • Figure 1: Map of the regions of interest in this study. The left panel shows the region of France, and the right panel shows the region of Chicago. Each region is a box of $3 \times 3$ pixels (for PlaSim resolution, each pixel is approximately 2.8 degrees). We chose to study mid-latitude heatwaves over France because previous studies using PlaSim or RES focused on this region \cite{ragone2018computation,miloshevich_probabilistic_2023-1}. We also selected the Chicago region, as it is one of the hottest areas in summer in the PlaSim world, with the aim of sampling the most extreme events possible.
  • Figure 2: Return-period curve for the T$_{2m}$ 7-day average over France (left panel) and Chicago (right panel). Black dots are obtained with an $N=50,000$-member DNS and serve as ground truth. The red dots are from DNS, but with the same computational budget as AI+RES ($N=400$). The cyan dots are obtained by running an $N=50,000$-member ensemble with the AI emulator only (AI-DNS). The solid blue line shows the median return-period curve produced by 10 independent realizations of the AI+RES algorithm with $N=400$ walkers, and the blue shaded area shows the 10th and 90th percentiles. The solid yellow line shows the median return-period curve produced by 10 independent realizations of the standard RES algorithm with $N=400$ walkers, and the yellow shaded area shows the 10th and 90th percentiles. The gray shaded area (EVT) is obtained by fitting generalized Pareto distributions (GPDs) to 100 independent training datasets of size $N=400$ and showing the 10th and 90th percentiles as a confidence interval.
  • Figure 2: Weather skill of the AI emulator. Global RMSE (top) and ACC (bottom) as a function of lead time for T$_{2m}$ (left) and Z$_{500}$ (right). Gold and red lines show the AI emulator with a single-member forecast and a 100-member ensemble forecast, respectively. The blue line corresponds to a persistence forecast. Horizontal black lines in the RMSE panels indicate the climatological forecast. Dashed horizontal lines in the ACC panels at 0.6 mark the conventional threshold below which forecasts are no longer considered skillful.
  • Figure 3: Speed-up factors for the AI+RES algorithm over France and Chicago. Left panel: Variance speed-up factor as a function of return period (Eq. \ref{eq:variance_speed_up_ratio}). This measures the reduction in the variance of return period estimates compared to DNS (see Methods). Right panel: Sampling speed-up factor as a function of return period (Eq. \ref{eq:sampling_speed_up_ratio}), averaged over 10 independent algorithm realizations. This measures the gain in the number of extreme samples produced by the algorithm compared to DNS (see Methods).
  • ...and 12 more figures
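
The return-period curves described in the captions above can be illustrated with a simple empirical estimator. This sketch rests on assumptions that the source does not confirm: each sample is one independent block (for example, one summer), the return period is taken as $T = 1/P(A \ge a)$ in units of blocks, and any importance weights have mean 1 in expectation. The function name `return_periods` is hypothetical.

```python
import numpy as np

def return_periods(values, weights=None):
    """Empirical return periods from (optionally weighted) samples.

    values  : one value of the observable per independent block (e.g. per year).
    weights : optional importance weights, assumed to have mean 1 in
              expectation; unit weights correspond to a plain ensemble.
    Returns the values sorted from most to least extreme and the return
    period of each, in units of one block.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    if weights is None:
        weights = np.ones(n)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)[::-1]  # most extreme first
    # P(A >= a): weighted fraction of samples at or above each sorted value.
    exceed_prob = np.cumsum(weights[order]) / n
    return values[order], 1.0 / exceed_prob

vals, periods = return_periods([1.0, 3.0, 2.0, 4.0])
```

With unit weights, the largest of $n$ samples is assigned a return period of $n$ blocks, which is why a plain ensemble of 400 members cannot resolve events much rarer than once in 400 years, whereas a weighted ensemble can assign small probabilities to its most extreme members.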