A non-intrusive machine learning framework for debiasing long-time coarse resolution climate simulations and quantifying rare events statistics

Benedikt Barthel Sorensen; Alexis Charalampopoulos; Shixuan Zhang; Bryce Harrop; Ruby Leung; Themistoklis Sapsis

TL;DR

The paper tackles the bias in coarse-resolution climate models caused by unresolved sub-grid processes and the challenge of estimating extreme-event statistics over long return periods. It introduces a non-intrusive neural-network correction operator $\mathcal{G}$ trained on nudged coarse trajectories that are steered toward high-fidelity references, enabling accurate long-time statistics without altering the underlying dynamics. The method is demonstrated on a simplified quasi-geostrophic model and on the Energy Exascale Earth System Model (E3SM), achieving substantial improvements in global and regional statistics, and a 51% reduction in mean Integrated Vapor Transport biases relative to ERA5 reference data. The approach offers a stable, post-processing framework applicable across domains, capable of quantifying rare-event statistics and potentially informing climate risk assessments and policy decisions.

Abstract

Due to the rapidly changing climate, the frequency and severity of extreme weather are expected to increase over the coming decades. As fully resolved climate simulations remain computationally intractable, policy makers must rely on coarse models to quantify the risk of extremes. However, coarse models suffer from inherent bias due to the ignored "sub-grid" scales. We propose a framework to non-intrusively debias coarse-resolution climate predictions using neural-network (NN) correction operators. Previous efforts have attempted to train such operators using loss functions that match statistics. However, this approach falls short for events with return periods longer than the training data, since the reference statistics have not converged. Here, the aim is to formulate a learning method that allows for correction of dynamics and quantification of extreme events with return periods longer than the training data. The key obstacle is the chaotic nature of the underlying dynamics. To overcome this challenge, we introduce a dynamical systems approach in which the correction operator is trained using reference data and a coarse model simulation nudged towards that reference. The method is demonstrated on debiasing an under-resolved quasi-geostrophic model and the Energy Exascale Earth System Model (E3SM). For the former, our method enables the quantification of events that have return periods two orders of magnitude longer than the training data. For the latter, when trained on 8 years of ERA5 data, our approach is able to correct the coarse E3SM output to closely reflect the 36-year ERA5 statistics for all prognostic variables and significantly reduce their spatial biases.
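The training idea in the abstract can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the Lorenz-63 system stands in for the climate model, a perturbed parameter `rho` plays the role of coarse-model error, nudging is written as a linear relaxation term with time scale `tau`, and an affine least-squares map replaces the neural-network correction operator. The point is only to show why nudging helps: the nudged coarse trajectory stays close to the reference, so the two form time-aligned input/target pairs despite the chaotic dynamics.

```python
import numpy as np

def lorenz_rhs(state, rho):
    """Lorenz-63 tendency; rho is varied to mimic coarse-model error (assumption)."""
    x, y, z = state
    return np.array([10.0 * (y - x), x * (rho - z) - y, x * y - (8.0 / 3.0) * z])

def rk4_step(state, rho, dt, target=None, tau=None):
    """One RK4 step. If target is given, add a relaxation term -(s - target) / tau.
    The target is held fixed over the step (zero-order hold)."""
    def rhs(s):
        ds = lorenz_rhs(s, rho)
        if target is not None:
            ds = ds - (s - target) / tau  # nudging towards the reference state
        return ds
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(rho, x0, n_steps, dt=0.01, reference=None, tau=None):
    """Integrate n_steps; nudge towards reference[k] at step k when provided."""
    traj = np.empty((n_steps, 3))
    state = np.asarray(x0, dtype=float)
    for k in range(n_steps):
        traj[k] = state
        target = None if reference is None else reference[k]
        state = rk4_step(state, rho, dt, target, tau)
    return traj

def fit_affine_correction(inputs, targets):
    """Least-squares affine map inputs -> targets (stand-in for the NN operator)."""
    features = np.hstack([inputs, np.ones((len(inputs), 1))])
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return lambda states: np.hstack([states, np.ones((len(states), 1))]) @ weights

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

n_steps = 4000
x0 = np.array([1.0, 1.0, 20.0])
reference = simulate(28.0, x0, n_steps)                          # "high-fidelity" run
nudged = simulate(26.0, x0, n_steps, reference=reference, tau=0.1)  # training input
free_run = simulate(26.0, x0, n_steps)                           # biased free-running model

correct = fit_affine_correction(nudged, reference)  # train on (nudged, reference) pairs
corrected_free = correct(free_run)                  # apply as post-processing only

print("RMSE nudged vs reference:          ", rmse(nudged, reference))
print("RMSE corrected nudged vs reference:", rmse(correct(nudged), reference))
print("Mean z: reference, free run, corrected free run:",
      reference[:, 2].mean(), free_run[:, 2].mean(), corrected_free[:, 2].mean())
```

The correction is applied to the free-running output after the fact, so the coarse solver itself is never modified, which is the sense in which such an approach is non-intrusive. An affine map is far too simple for a real climate model; it is used here only so the sketch stays short and deterministic.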

Paper Structure (21 sections, 27 equations, 18 figures, 1 table)

Introduction
Training correction operators for imperfect chaotic systems
Quasi-Geostrophic Model
Background
Neural network architecture and training strategy
Results
Prediction of long time statistics
Minimum training data requirement
Evaluation for different flow parameters than the training data
Global Climate Model
Dataset
Neural network architecture and training strategy
Results
Global statistics
Integrated Vapor Transport
...and 6 more sections

Figures (18)

Figure 1: Description of the method, which learns a map between the attractor of the coarsely-resolved equations and the attractor of the reference trajectory. Left: the red dashed curve represents the reference trajectory. The black curve is a coarsely-resolved trajectory nudged towards the reference trajectory. The green curve is the free-running coarsely-resolved trajectory, which is not used for training (shown for reference). Right: the target attractor and the target trajectory (red), the same as the dashed curve shown in the left plot.
Figure 2: Workflow of the training process (top) and testing process (bottom) for the machine learning of correction operators and their application on the generation of long time climate simulations, i.e. longer than the reference dataset.
Figure 3: Example zonally averaged stream function $\hat{\psi}_1$ of the QG system (\ref{eq:QG_model}) for $\beta = 2.0$ and $r = 0.1$. From top to bottom: fully resolved, i.e. reference solution (RD), free-running coarse simulation (CR), spectrally corrected nudged simulation (NC).
Figure 4: Model prediction for $\beta = 2.0$ and $r = 0.1$. (a) Power spectrum and probability density function of stream functions $\psi_1$ (top row) and $\psi_2$ (bottom row): test data RD (solid black), CR (dashed black), ML(CR) (blue), and training data $\mathrm{RD_{train}}$ (red). (b) Zonally averaged stream function $\bar{\psi}_1$: RD (upper panel) and ML(CR) (lower panel). $T_{train} = 1,000$ and $T_{test} = 34,000$.
Figure 5: Probability density function of individual Fourier modes for $\beta = 2.0$ and $r = 0.1$. RD (solid black), CR (dashed black), ML(CR) (blue). $T_{train} = 1,000$ and $T_{test} = 34,000$.
...and 13 more figures
