Machine Learning of Vertical Fluxes by Unresolved Midlatitude Mesoscale Processes

Erisa Ismaili; Robert C. Jnglin Wills; Tom Beucler

TL;DR

The results demonstrate the importance of vertically non-local processes, clarify the regime-dependent predictability of mesoscale fluxes, and identify variables most informative for their parameterization, providing guidance for improving ESMs with ML and advancing the understanding of multi-scale interactions in the midlatitudes.

Abstract

Machine learning (ML) can represent processes unresolved in coarse-resolution Earth system models (ESMs) by learning from high-resolution climate data. Such ML parameterization approaches have been primarily tested in idealized setups where they have focused on deep convection. It remains largely unexplored whether these approaches could be used in a more targeted fashion to learn vertical fluxes resulting from midlatitude mesoscale processes, such as slantwise convection and frontal dynamics in extratropical cyclones, which are not well represented in ESMs. To address this, we employ a variable-resolution CESM2 simulation with a refined area over the North Atlantic (14-km grid refinement) that resolves such midlatitude mesoscale processes. We train an artificial neural network to predict vertical profiles of mesoscale moisture, heat, and momentum fluxes from the perspective of a coarse-resolution (111-km grid) model. Our results show that a large number of features are required to achieve reasonable model performance when data come from the midlatitudes of real-geography atmospheric simulations, especially when coarse-grained vertical velocities, which we show are not representative of vertical velocities in a coarse-resolution model, are excluded as inputs. Feature importance analysis reveals the importance of vertically non-local information in temperature, moisture, and the meridional wind. We suggest that these non-local relationships capture the influence of cold air outbreaks and fronts on mesoscale fluxes. Our results demonstrate the importance of vertically non-local processes, clarify the regime-dependent predictability of mesoscale fluxes, and identify variables most informative for their parameterization, providing guidance for improving ESMs with ML and advancing our understanding of multi-scale interactions in the midlatitudes.
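The abstract describes targets that are mesoscale fluxes seen from a coarse grid. A minimal sketch of one common way to define such a flux is given below: the coarse-grained product minus the product of the coarse-grained fields. It assumes a regular 2-D grid and non-overlapping block averaging by a factor of 8 (roughly 14 km to 111 km); the function names `block_mean` and `subgrid_flux` are hypothetical, the data are synthetic, and the paper's actual coarse-graining on the variable-resolution CESM2 grid may differ.

```python
import numpy as np

def block_mean(field, factor):
    """Coarse-grain a 2-D field by averaging non-overlapping factor x factor blocks.

    Assumption: a regular grid whose dimensions are divisible by `factor`.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("field dimensions must be divisible by factor")
    return field.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))

def subgrid_flux(w, phi, factor):
    """Subgrid flux: coarse-grained product minus product of coarse-grained fields."""
    return block_mean(w * phi, factor) - block_mean(w, factor) * block_mean(phi, factor)

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 16))   # synthetic vertical velocity on the fine grid
q = rng.normal(size=(16, 16))   # synthetic transported scalar, e.g. specific humidity
flux = subgrid_flux(w, q, factor=8)  # one value per coarse cell
```

With this definition, each coarse-cell value equals the covariance of the fine-grid deviations inside that cell, so a field that is uniform within a coarse cell contributes no subgrid flux there.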

Paper Structure (25 sections, 6 equations, 16 figures, 6 tables)

Introduction
Data
Climate Model Data for Training
Selection of the Features and Targets
Sampling Strategy and Data Partitioning
Data Pre-Processing and Normalization
Machine Learning Methodology
Neural Network Architecture
Feature Importance and Localization Experiments
Shapley Analysis for Physical Interpretability
Results
Regime Definition via K-means Clustering
Performance of the ANN
Which Meteorological Variables Contribute Most to Model Skill?
Vertical Localization Relationship
...and 10 more sections

Figures (16)

Figure 1: Schematic explaining the methods for training an artificial neural network (ANN) to learn mesoscale atmospheric fluxes. A front-resolving simulation by Wills et al. (2024) provides the data with a refined resolution over the North Atlantic. We focus on the Gulf Stream region (red area), where columns of atmospheric variables are used to construct the dataset containing the features and targets to train the ANN. To focus on the troposphere, only 22 out of 32 levels are used. The 244 features contain the profile information of 11 atmospheric state variables and single-level information of the surface pressure and CAPE variables. The 88 targets contain the columns of the subgrid-scale fluxes of moisture, heat, and the horizontal momenta. Features and target variables are coarse-grained to a resolution of roughly 100 km. Using entire profile information for the feature and target variables is referred to as profile-to-profile mapping. Ablation experiments and XAI methods are used to assess feature importance.
Figure 2: Example event of an extratropical cyclone at model time stamp 0032-12-12 00:00:00 with a map of a) categorical regimes derived from a K-means clustering with coarse-grained variables (see Section "Regime Definition via K-means Clustering"), b) the high-resolution total precipitation and the sea level pressure anomalies, c) the high-resolution heat flux, d) the coarse-grained heat flux, e) the true subgrid-scale flux and f) the predicted subgrid-scale flux by the ANN. The selected level for the heat flux is the nearest level to 700 hPa.
Figure 3: Kernel density estimates of the per-cluster distributions of a few relevant features used in the K-means++ clustering algorithm, as well as additional important variables that are not used for the clustering. a) includes features and variables that especially help to identify clusters 1, 2 and 4, b) contains the distributions of precipitation and the mesoscale fluxes of zonal wind and heat, which help identify clusters 0 and 3, and c) contains variables that help assess the relevance of CSI in the different clusters. The densities are normalized separately for each cluster, and each variable is standardized (but not centered) independently. Thus, the x-axis units are standard deviations of each variable along the sample dimension. Note the logarithmic scales for precipitation and the subgrid-scale fluxes.
Figure 4: The main ANN model is skillful for profile-to-profile mappings. The bars represent the $R^2$ scores for every atmospheric level, where blue is the skill of the ANN and orange is the skill of the multiple linear regression (MLR) baseline. Additionally, for each level, the standard deviation of the test data is displayed in black, as well as the root mean square error (RMSE) of the ANN in blue and the RMSE of the MLR in orange. The units for these metrics are arbitrary due to the normalization. Uncertainty estimation is presented as a boxplot indicating the median, the interquartile range, and the whiskers of the distribution obtained by bootstrapping. The average pressure of the hybrid-coordinate vertical levels is shown on the y-axis.
Figure 5: Predicted vs. true scatter plots for Clusters 0-4. Each data point represents an output of a sample (any one of the subgrid fluxes from any level). The dashed diagonal line indicates the ideal case where predictions would perfectly match the true values. On the top marginal of the scatter plot, the histogram of the true values is displayed, while on the right marginal, the histogram of the predicted values is displayed. Both histograms show the counts in logarithmic scale. In the bottom right panel, the histograms of the mean square error for each cluster are displayed with counts in logarithmic scale.
...and 11 more figures
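The captions of Figures 1 and 4 imply a simple bookkeeping: 11 profile variables on 22 levels plus 2 single-level variables give 244 features, 4 fluxes on 22 levels give 88 targets, and skill is reported as an $R^2$ score per level with a bootstrap spread. The sketch below illustrates that bookkeeping on synthetic data. The array layout (variable-major flattening), the helper names, and the bootstrap scheme (resampling test samples with replacement) are assumptions for illustration, not details confirmed by the captions.

```python
import numpy as np

N_LEVELS = 22        # tropospheric levels kept (Figure 1)
N_PROFILE_VARS = 11  # state variables with full vertical profiles
N_SCALARS = 2        # surface pressure and CAPE
N_FLUXES = 4         # moisture, heat, zonal and meridional momentum

def build_features(profiles, scalars):
    """Flatten (samples, 11, 22) profiles and append (samples, 2) scalars."""
    n = profiles.shape[0]
    return np.concatenate([profiles.reshape(n, -1), scalars], axis=1)

def r2_per_output(y_true, y_pred):
    """R^2 for each output column, computed across the sample dimension."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

def bootstrap_r2(y_true, y_pred, n_boot=50, seed=0):
    """Assumed scheme: resample test samples with replacement, recompute R^2."""
    rng = np.random.default_rng(seed)
    n = y_true.shape[0]
    scores = np.empty((n_boot, y_true.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        scores[b] = r2_per_output(y_true[idx], y_pred[idx])
    return scores

# Synthetic stand-ins for coarse-grained columns and an imperfect prediction.
rng = np.random.default_rng(1)
n_samples = 500
X = build_features(rng.normal(size=(n_samples, N_PROFILE_VARS, N_LEVELS)),
                   rng.normal(size=(n_samples, N_SCALARS)))
y_true = rng.normal(size=(n_samples, N_FLUXES * N_LEVELS))
y_pred = y_true + 0.5 * rng.normal(size=y_true.shape)

r2 = r2_per_output(y_true, y_pred)           # shape (88,)
r2_by_flux = r2.reshape(N_FLUXES, N_LEVELS)  # one row per flux, one column per level
spread = bootstrap_r2(y_true, y_pred)        # shape (n_boot, 88), e.g. for boxplots
```

Reshaping the 88 scores to `(4, 22)` gives one skill profile per flux, which is the quantity a per-level bar chart such as Figure 4 would display.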
