Gradient-free online learning of subgrid-scale dynamics with neural emulators

Hugo Frezat, Ronan Fablet, Guillaume Balarac, Julien Le Sommer

TL;DR

The paper addresses online training of subgrid-scale (SGS) parametrizations for non-differentiable climate solvers by introducing a neural emulator of the coarse solver to enable gradient-based optimization. It presents a two-step training scheme that separately learns a differentiable emulator and an SGS parametrization with compensated losses to minimize bias arising from emulator errors. Demonstrations on a chaotic two-timescale Lorenz-96 system and a barotropic quasi-geostrophic flow show that emulator-based online training can approach the performance of true online training without requiring solver adjoints, with emulator quality and loss design playing pivotal roles. The work highlights potential for applying gradient-free online strategies to larger, production-scale climate models and motivates further improvements in emulator architectures and evaluation metrics. Overall, it advances a practical route toward stable, data-driven SGS closures in earth-system modeling.

Abstract

In this paper, we propose a generic algorithm to train machine learning-based subgrid parametrizations online, i.e., with a posteriori loss functions, but for non-differentiable numerical solvers. The proposed approach leverages a neural emulator to approximate the reduced state-space solver, which is then used to allow gradient propagation through temporal integration steps. We apply this methodology to a chaotic two-timescale Lorenz-96 system and a single-layer quasi-geostrophic system with zonal dynamics, known to be highly unstable with offline strategies. Using our algorithm, we are able to train a parametrization that recovers most of the benefits of online strategies without having to compute the gradient of the original solver. We found that training the neural emulator and parametrization components separately, with different loss quantities, is necessary to minimize the propagation of approximation biases. Experiments on emulator architectures of different complexities also indicate that emulator performance is key to learning an accurate parametrization. This work is a step towards learning parametrizations with online strategies for climate models.
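The two-step idea described in the abstract can be illustrated on a deliberately tiny problem. The sketch below is not the authors' implementation: it assumes a scalar toy system, a one-parameter linear map standing in for the neural emulator, and a one-parameter linear closure standing in for the SGS parametrization. All names (`coarse_solver`, `train_emulator`, `online_loss_and_grad`, `train_parametrization`) and all constants are hypothetical. The point it shows is that the coarse solver is only ever called as a black box to produce data, while the gradient of the a posteriori (trajectory) loss is propagated through the emulator instead.

```python
DT = 0.1        # time step (hypothetical value)
S_TRUE = 0.5    # "true" SGS coefficient, unknown to the learner (hypothetical value)


def coarse_solver(x):
    """Black-box coarse solver g: one explicit step of dx/dt = -x.
    It is only evaluated, never differentiated."""
    return x + DT * (-x)


def reference_rollout(x0, n_steps):
    """Stand-in for reference data: coarse step plus the true SGS contribution."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(coarse_solver(xs[-1]) + DT * S_TRUE * xs[-1])
    return xs


def train_emulator(samples):
    """Step 1: fit the emulator E_phi(x) = phi * x to input/output pairs of g
    by least squares (closed form for a single linear parameter)."""
    num = sum(x * coarse_solver(x) for x in samples)
    den = sum(x * x for x in samples)
    return num / den


def online_loss_and_grad(theta, phi, x0, ref):
    """Step 2 loss: roll out emulator + parametrization M_theta(x) = theta * x and
    accumulate the squared trajectory error with its exact gradient in theta."""
    x, dx = x0, 0.0              # state and its sensitivity d(x)/d(theta)
    loss, grad = 0.0, 0.0
    for target in ref[1:]:
        # x_{k+1} = phi * x_k + DT * theta * x_k; the right-hand side uses the old x, dx
        x, dx = (phi + DT * theta) * x, (phi + DT * theta) * dx + DT * x
        loss += (x - target) ** 2
        grad += 2.0 * (x - target) * dx
    return loss, grad


def train_parametrization(phi, trajectories, lr=0.1, n_iters=300):
    """Step 2: gradient descent on theta with the emulator parameter phi frozen."""
    theta = 0.0
    for _ in range(n_iters):
        grad = sum(online_loss_and_grad(theta, phi, x0, ref)[1]
                   for x0, ref in trajectories)
        theta -= lr * grad / len(trajectories)
    return theta


samples = [-2.0, -1.0, 0.5, 1.0, 2.0]
phi = train_emulator(samples)
trajectories = [(x0, reference_rollout(x0, 10)) for x0 in (1.0, -0.5, 1.5)]
theta = train_parametrization(phi, trajectories)
print(f"emulator phi = {phi:.4f}, learned SGS coefficient theta = {theta:.4f}")
```

In this toy the emulator is exact, so the learned coefficient matches the true one; with an imperfect emulator its error would leak into `theta`, which is the bias the abstract says the separate losses are meant to limit.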

Paper Structure

This paper contains 14 sections, 41 equations, 8 figures, 3 tables, and 1 algorithm.

Figures (8)

  • Figure 1: Evolution of the deviation (based on the root mean squared error) between reference trajectory $X_k$ and the perturbed ensemble $\delta X_k$ for the L96 system (left). The dashed line shows the approximate decorrelation time $t_c$. Evolution of the mean state $\langle X_k \rangle_k$ throughout the 25 trajectories (right) for the reference solver $f$ (blue), coarse solver $g$ with restarts at each sub-trajectory (indicated by a circle), and neural emulator $\mathcal{E}$ trained to reproduce the dynamics of $g$.
  • Figure 2: State-averaged cumulative error $E(t)$ (left) and Wasserstein distance $W_1(X_k \sim P, \hat{X}_k \sim Q)$ (right) for 100 decorrelation times $t_c$ (or 30000 iterations with coarse solver $g$) for the L96 system.
  • Figure 3: Evolution of the deviation (based on the root mean squared error) between reference trajectory $\omega(t)$ and the perturbed ensemble $\delta \omega(t)$ for the QG system (left). The dashed line shows the approximate decorrelation time $t_c$. Vorticity field $\omega$ (center) and filtered vorticity field $\bar{\omega}$ (right) at the end of the spin-up.
  • Figure 4: A priori evaluation of the different models used in this study. The test set is composed of 2500 filtered samples from a DNS performed after spin-up, i.e., prolonging the initial trajectory. Averaged probability density function (PDF) of the SGS term $\tau_\omega$ (left) and energy fluxes due to the SGS term $\partial_t \partial E(k)_\text{sgs}$ (right).
  • Figure 5: Evolution of kinetic energy $E(t)$ (left) and enstrophy $Z(t)$ (right) for the different models used in this study. Simulations are integrated for almost $300$ turnover times $t_L$.
  • ...and 3 more figures
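
The Figure 2 caption reports a Wasserstein distance $W_1$ between the reference and predicted state distributions. As a reminder of what that quantity measures, here is a minimal sketch for two one-dimensional empirical samples of equal size, where $W_1$ reduces to the mean absolute difference between sorted values. The function name `wasserstein_1` is hypothetical, and how the paper actually estimates $W_1$ is not stated in this excerpt.

```python
def wasserstein_1(xs, ys):
    """W_1 between two equal-size 1D empirical samples:
    the mean absolute difference between their sorted values."""
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Shifting every sample by 1 moves the distribution by exactly 1.
shifted = wasserstein_1([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
# The same values in a different order describe the same distribution.
same = wasserstein_1([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])
print(shifted, same)
```

Unlike a pointwise trajectory error, this metric ignores the ordering of the samples, which makes it suitable for comparing long-term statistics of chaotic systems after trajectories have decorrelated.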