Retrofitting Earth System Models with Cadence-Limited Neural Operator Updates

Aniruddha Bora, Shixuan Zhang, Khemraj Shukla, Bryce Harrop, George Em Karniadakis, L. Ruby Leung

TL;DR

This work presents cadence-limited online bias corrections for a legacy Earth system model by learning neural-operator updates that map instantaneous states to nudging tendencies. It introduces two FiLM-conditioned UNet-inspired architectures, IUNet and M&M, with multiscale, fully differentiable upsampling designed for online deployment in EAMv2. Offline tests show strong generalization and that M&M offers the best overall fidelity; online integrations demonstrate stable, 2–10% RMSE reductions across key fields, with IUNet and M&M providing the most robust improvements. The study emphasizes stability, portability, and computational feasibility, laying groundwork for extended training and online-learning strategies to further retrofit legacy climate models with expressive neural operators.
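
As a rough illustration of the FiLM conditioning mentioned above, the sketch below applies feature-wise linear modulation in NumPy: a conditioning vector is mapped linearly to a per-channel scale and shift, which then modulate the feature maps. The function name `film`, the weight shapes, and the choice of conditioning scalars are assumptions made for illustration; the summary does not specify which scalars condition IUNet or M&M, nor where in the network the modulation is applied.

```python
import numpy as np

def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation: scale and shift each channel.

    features: (channels, height, width) feature maps
    cond:     (n_cond,) conditioning scalars (hypothetical choice)
    w_gamma, w_beta: (channels, n_cond) linear weights
    b_gamma, b_beta: (channels,) biases
    """
    gamma = w_gamma @ cond + b_gamma  # per-channel scale
    beta = w_beta @ cond + b_beta     # per-channel shift
    # Broadcast the per-channel parameters over the spatial dimensions.
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 16))
cond = np.array([0.25, 1.0])  # two made-up conditioning scalars

# With zero weights, unit scale bias, and zero shift bias, FiLM is the identity.
identity = film(features, cond,
                np.zeros((4, 2)), np.ones(4),
                np.zeros((4, 2)), np.zeros(4))

# A non-trivial modulation: random weights give channel-dependent scale and shift.
modulated = film(features, cond,
                 rng.standard_normal((4, 2)), np.ones(4),
                 rng.standard_normal((4, 2)), np.zeros(4))
```

In a trained network the weights producing `gamma` and `beta` would be learned jointly with the convolutional layers; here they are fixed arrays only to keep the sketch self-contained.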

Abstract

Coarse resolution, imperfect parameterizations, and uncertain initial states and forcings limit Earth-system model (ESM) predictions. Traditional bias correction via data assimilation improves constrained simulations but offers limited benefit once models run freely. We introduce an operator-learning framework that maps instantaneous model states to bias-correction tendencies and applies them online during integration. Building on a U-Net backbone, we develop two operator architectures, Inception U-Net (IUNet) and a multi-scale network (M&M), that combine diverse upsampling and receptive fields to capture multiscale nonlinear features under Energy Exascale Earth System Model (E3SM) runtime constraints. Trained on two years of E3SM simulations nudged toward ERA5 reanalysis, the operators generalize across height levels and seasons. Both architectures outperform standard U-Net baselines in offline tests, indicating that functional richness rather than parameter count drives performance. In online hybrid E3SM runs, M&M delivers the most consistent bias reductions across variables and vertical levels. The ML-augmented configurations remain stable and computationally feasible in multi-year simulations, providing a practical pathway for scalable hybrid modeling. Our framework emphasizes long-term stability, portability, and cadence-limited updates, demonstrating the utility of expressive ML operators for learning structured, cross-scale relationships and retrofitting legacy ESMs.
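
The online coupling described in the abstract can be pictured with a toy time loop. The sketch below assumes that "cadence-limited" means the learned operator is evaluated only every `cadence` model steps and that its tendency is held fixed in between; that reading, along with the names `run_with_cadence`, `toy_step`, and `toy_correction`, is an assumption made for illustration and is not taken from the E3SM coupling code.

```python
import numpy as np

def run_with_cadence(state, step_fn, correction_fn, n_steps, dt, cadence):
    """Integrate `state`, refreshing the learned tendency every `cadence` steps.

    step_fn(state, dt)   -> state after one uncorrected model step
    correction_fn(state) -> bias-correction tendency (same shape as state)
    """
    tendency = np.zeros_like(state)
    n_calls = 0
    for step in range(n_steps):
        if step % cadence == 0:
            # Evaluate the operator on the instantaneous state.
            tendency = correction_fn(state)
            n_calls += 1
        # Advance the model, then apply the (possibly held-over) correction.
        state = step_fn(state, dt) + dt * tendency
    return state, n_calls

def toy_step(state, dt):
    """Hypothetical model step with a constant drift standing in for model bias."""
    return state + dt * 0.1

def toy_correction(state):
    """Hypothetical operator that happens to cancel the drift exactly."""
    return np.full_like(state, -0.1)

state0 = np.ones((2, 3))
final, calls = run_with_cadence(state0, toy_step, toy_correction,
                                n_steps=12, dt=0.5, cadence=4)
```

With 12 steps and a cadence of 4, the operator is called three times while the correction is applied at every step, which is the kind of trade-off between inference cost and correction frequency that a cadence limit would control.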

Paper Structure

This paper contains 24 sections, 22 equations, 22 figures, 7 tables.

Figures (22)

  • Figure 1: Workflow and learning task (M&M with FiLM conditioning). (A) End-to-end workflow. ERA5 provides the reference data and the U.S. Department of Energy’s Energy Exascale Earth System Model (E3SM) provides simulated states. E3SM runs nudged toward ERA5 yield paired model states and nudging tendencies used to train the ML operator, which is then coupled back into E3SM for online bias correction during free-running simulations (no reference data). (B) Inputs and targets. (a) The pre-nudging state variables (zonal wind $U$, meridional wind $V$, temperature $T$, and specific humidity $Q$) serve as inputs. (b) The learned model maps these states to the corresponding nudging tendencies. The central network schematic in (B) depicts the M&M architecture with FiLM scalar conditioning; full architectural details are provided in the Methods, and expanded layer-level diagrams appear in the Supplementary Information (see Supplementary Figures S14 and S15).
  • Figure 2: Offline comparison of mean wind nudging tendencies across neural-operator architectures. Panels (a–e) show zonal wind correction tendencies (UTEND) and panels (f–j) show meridional wind correction tendencies (VTEND), each averaged over all vertical levels and over the year 2015. For each component, global maps from UNet, UNet with increased parameters (UNetMP), IUNet, and the multi-branch full-rank decoder (M&M) are shown alongside the reference nudging tendencies from the training data (“Truth”). Values within each panel denote the spatial pattern correlation (PCC, Pearson correlation) with the reference field, summarizing large-scale spatial agreement. Across both UTEND and VTEND, M&M exhibits the strongest correspondence with the reference tendencies, followed by UNetMP, while UNet and IUNet show more mixed performance. All models were trained on the same dataset with identical batch size, optimizer, and number of epochs to ensure a controlled comparison.
  • Figure 3: Offline comparison of vertical-layer skill for wind nudging tendencies across neural-operator architectures. Panels (a–h) show layer-wise temporal correlation coefficients (TCC; Pearson correlation over time at each vertical level) between predicted and reference nudging tendencies (training datasets) for zonal (UTEND; top row) and meridional (VTEND; bottom row) wind components. Columns correspond to different architectures: UNet, UNet with increased parameters (UNetMP), IUNet, and the multi-branch full-rank decoder (M&M). Positive correlations indicate better temporal tracking of reference tendencies, whereas negative values reflect phase mismatch. The vertical-level index increases toward the surface (higher index = nearer-surface levels). Panels (i–j) show the spatial pattern correlation at each level, computed as the Pearson spatial correlation between predicted and reference fields at every time step and then averaged over 2015. Curves summarize this mean pattern correlation for UTEND (i) and VTEND (j), using the same vertical-level indexing as in panels (a–h). Higher values indicate better reproduction of the spatial structure at each level.
  • Figure 4: Global distributions of annual-mean nudging tendencies. Zonal wind (UTEND, m s$^{-1}$ 3 hr$^{-1}$; panels a1--a5), meridional wind (VTEND, m s$^{-1}$ 3 hr$^{-1}$; panels b1--b5), air temperature (TTEND, K 3 hr$^{-1}$; panels c1--c5), and specific humidity (QTEND, g kg$^{-1}$ 3 hr$^{-1}$; panels d1--d5) are averaged over all model levels and the 5 years of 2012--2016. The first column shows nudging tendencies from the reference simulation (Nudge), which serves as the target for model comparison. Columns two through five show results from different model configurations, including the free-running baseline (CTRL) and machine learning--corrected simulations (UNet, UNetMP, IUNet, and M&M), as summarized in Supplementary Table S5. Color contours represent the magnitude and sign of the vertically averaged nudging tendencies, with cool colors indicating negative values and warm colors positive values. Pattern Correlation Coefficients (PCCs) relative to the reference are shown in the upper right corner of each panel to quantify spatial agreement. All tendencies are scaled to 3-hourly values for consistency. Details of the simulation configurations are provided in Supplementary Table S5.
  • Figure 5: Seasonal root-mean-square error (RMSE) averaged over 2012--2016. Shown are the atmospheric state variables at pressure levels (top) and surface levels (bottom). Each square is divided into four triangles representing DJF (December--February), MAM (March--May), JJA (June--August), and SON (September--November). Colors indicate the percent difference in RMSE relative to the free-running EAMv2 baseline simulation (CLIM), with negative values indicating a reduction (improvement) and positive values an increase (degradation). Rows correspond to experiments and columns to variables (labels follow standard CMIP/CF aliases). Details of the simulation configurations are provided in Supplementary Table S5.
  • ...and 17 more figures
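
The skill scores named in the captions above (spatial pattern correlation, temporal correlation, and percent difference in RMSE relative to a baseline) can be sketched in NumPy as follows. This is a minimal version under stated assumptions: the correlations are plain unweighted Pearson correlations, as the captions describe them, with no latitude or area weighting, and the function names and the `(time, lat, lon)` array layout are hypothetical.

```python
import numpy as np

def pattern_correlation(pred, ref):
    """Unweighted Pearson correlation across all grid points of two 2-D fields."""
    p = pred.ravel() - pred.mean()
    r = ref.ravel() - ref.mean()
    return float(p @ r / np.sqrt((p @ p) * (r @ r)))

def temporal_correlation(pred, ref):
    """Pearson correlation over time (axis 0) at every grid point.

    Inputs are (time, lat, lon) arrays; the result is a (lat, lon) map.
    """
    p = pred - pred.mean(axis=0)
    r = ref - ref.mean(axis=0)
    return (p * r).sum(axis=0) / np.sqrt((p ** 2).sum(axis=0) * (r ** 2).sum(axis=0))

def rmse_percent_change(experiment, baseline, reference):
    """Percent difference in RMSE relative to the baseline (negative = improvement)."""
    def rmse(x):
        return np.sqrt(np.mean((x - reference) ** 2))
    return 100.0 * (rmse(experiment) - rmse(baseline)) / rmse(baseline)

# Synthetic data only: a prediction that is an exact linear rescaling of the reference.
rng = np.random.default_rng(0)
ref = rng.standard_normal((10, 4, 8))
pred = 2.0 * ref + 1.0

pcc = pattern_correlation(pred[0], ref[0])
tcc = temporal_correlation(pred, ref)

# Baseline is off by 1.0 everywhere, experiment by 0.9: a 10% RMSE reduction.
change = rmse_percent_change(ref + 0.9, ref + 1.0, ref)
```

Because Pearson correlation is invariant to positive scaling and offsets, the synthetic `pred` correlates perfectly with `ref` even though its amplitude is wrong, which is why the captions pair correlation scores with RMSE-based comparisons.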