Adjoint-based online learning of two-layer quasi-geostrophic baroclinic turbulence

Fei Er Yan, Hugo Frezat, Julien Le Sommer, Julian Mak, Karl Otness

TL;DR

Two online approaches are considered: a full adjoint-based online approach, related to traditional adjoint optimization approaches that require a “differentiable” dynamical model, and an approximately online approach that approximates the adjoint calculation and does not require a differentiable dynamical model.

Abstract

Because of computational constraints, most global ocean circulation models used for Earth System Modeling still rely on parameterizations of sub-grid processes, and limitations in these parameterizations affect the modeled ocean circulation and its predictive skill. An increasingly popular approach is to use machine learning for parameterizations, regressing for a map between the resolved state and the missing feedbacks in a fluid system as a supervised learning task. However, the learning is often performed in an 'offline' fashion, without involving the underlying fluid dynamical model during the training stage. Here, we explore the 'online' approach, which involves the fluid dynamical model during the training stage, for the learning of baroclinic turbulence and its parameterization, with reference to ocean eddy parameterization. Two online approaches are considered: a full adjoint-based online approach, related to traditional adjoint optimization approaches that require a 'differentiable' dynamical model, and an approximately online approach that approximates the adjoint calculation and does not require a differentiable dynamical model. The online approaches are found to be generally more skillful and numerically stable than offline approaches. Other details relating to online training, such as window size, machine learning model setup, and design of the loss functions, are discussed to aid further exploration of the online training methodology for Earth System Modeling.
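
To make the offline/online distinction concrete, here is a minimal sketch on a toy scalar model. It is not the paper's two-layer quasi-geostrophic model or its CNN. The model `x_{n+1} = x_n + dt * (-x_n + theta * x_n)`, the one-parameter closure `theta * x`, and all function names are assumptions made purely for illustration. The offline gradient regresses the closure against diagnosed targets state by state, while the online gradient differentiates a trajectory loss through the time stepper with a hand-written discrete adjoint (a backward sweep over the rollout).

```python
def rollout(x0, theta, dt, n_steps):
    """Step the toy model forward; returns [x_0, ..., x_N]."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        # Resolved tendency (-x) plus the learned closure (theta * x).
        xs.append(x + dt * (-x + theta * x))
    return xs


def offline_grad(theta, states, closure_targets):
    """Gradient of sum_i (theta * x_i - S_i)^2; the time stepper is not involved."""
    return sum(2.0 * (theta * x - s) * x for x, s in zip(states, closure_targets))


def online_loss(theta, x0, target, dt):
    """Trajectory mismatch over a training window of len(target) - 1 steps."""
    xs = rollout(x0, theta, dt, len(target) - 1)
    return sum((x - t) ** 2 for x, t in zip(xs[1:], target[1:]))


def online_grad_adjoint(theta, x0, target, dt):
    """Gradient of online_loss with respect to theta via a backward adjoint sweep."""
    n_steps = len(target) - 1
    xs = rollout(x0, theta, dt, n_steps)
    step_jacobian = 1.0 + dt * (theta - 1.0)  # d x_{n+1} / d x_n
    lam = 0.0   # adjoint variable: total dL/dx_n
    grad = 0.0
    for n in range(n_steps, 0, -1):
        lam += 2.0 * (xs[n] - target[n])  # direct loss contribution at step n
        grad += lam * dt * xs[n - 1]      # partial of x_n w.r.t. theta is dt * x_{n-1}
        lam *= step_jacobian              # propagate sensitivity back to x_{n-1}
    return grad


# Hypothetical "truth": the same model with closure coefficient 0.5.
dt, window = 0.1, 20
truth = rollout(1.0, 0.5, dt, window)
g_online = online_grad_adjoint(0.0, 1.0, truth, dt)
g_offline = offline_grad(0.0, truth, [0.5 * x for x in truth])
```

In this toy setting both gradients point toward the true coefficient. The difference is that the online gradient accounts for how closure errors accumulate over the training window, which requires the Jacobian of the time stepper, that is, a differentiable model.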

Paper Structure

This paper contains 17 sections, 13 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Snapshots of upper-layer perturbation potential vorticity $q_1$ at the end of a ten-year simulation. Data from the ($a$) high resolution model truth, ($b$) low resolution model with no CNN active, ($c$) offline model, ($d$) online model, and ($e$) approximately online model.
  • Figure 2: Quantile-Quantile (Q-Q) plot, with the distribution of diagnosed target (the $S^q$ from high resolution filtered onto the coarse grid) on the $x$ axis, against the predicted $\hat{S}^q$ generated by the CNNs on the $y$ axis (green: offline model; orange: online (approx) model; blue: online model). ($a,b$) Results from a priori testing, and ($c,d$) a posteriori testing, for layer 1 and 2 respectively. The red line is the identity line $y=x$.
  • Figure 3: Depth-averaged total kinetic energy spectra, normalized by $n^2_x \times n^2_y$. All panels show the data from the target high resolution simulation (black line) and the low resolution simulation with no CNN active (grey line). Results from ($a$) the offline model (green line), ($b$) the full online model (blue line), and ($c$) the approximately online model (orange line). The colored lines show the average over a ten-member ensemble of the hybrid models, and the shading denotes the standard deviation of the ensemble.
  • Figure 4: Full spectral energy budget for the APE flux (blue line), APE generation from the imposed background shear (orange line), KE flux (green line), bottom friction (red line), dissipation from the spectral filter (purple line), and input from the CNNs (brown line). ($a$) The high resolution model truth. ($b$) The low resolution model with no CNNs active. ($c$) The offline hybrid model. ($d$) The online hybrid model. ($e$) The approximately online hybrid model. For ($c,d,e$), the colored lines are from the ten member ensemble averaged performance of the hybrid models, where the shading denotes the standard deviation of the ensemble.
  • Figure 5: Distribution similarity scores for potential vorticity $q$, both components of the velocity $u$ and $v$, the streamfunction $\psi$, KE and enstrophy, depth-averaged KE flux, APE flux, APE generation, and the bottom drag (labeled as Friction). Subscripts 1 and 2 denote the upper and lower layer, respectively. Scores are shown for the (green) offline model, (blue) full online model, and (orange) approximately online model. The solid line is the ensemble average over the ten members, and the shading denotes the standard deviation of the ensemble.
  • ...and 3 more figures