Representing Subgrid-Scale Cloud Effects in a Radiation Parameterization using Machine Learning: MLe-radiation v1.0

Katharina Hafner, Sara Shamekh, Guillaume Bertoli, Axel Lauer, Robert Pincus, Julien Savre, Veronika Eyring

TL;DR

The paper tackles large uncertainties in radiative transfer due to subgrid-scale clouds in coarse-resolution Earth System Models by learning the cloud radiative impact (CRI) from high-resolution, global storm-resolving simulations. It introduces a hybrid physics-ML radiation parameterization in which a BiLSTM neural network predicts the cloud contribution to heating rates from vertical state profiles, while a conventional physics-based scheme provides clear-sky fluxes. CRI is defined as the difference between all-sky and clear-sky heating rates, $\frac{\partial T_{\text{CRI}}}{\partial t} = \frac{\partial T_{\text{all-sky}}}{\partial t} - \frac{\partial T_{\text{clear-sky}}}{\partial t}$. The model is trained on 5 km QUBICC data coarse-grained to the target resolution (~80 km) and is evaluated against a McICA-based pyRTE+RRTMGP baseline in the shortwave and longwave spectral ranges, for fully and partially cloudy conditions and across multiple regions. The results show substantial reductions in heating-rate errors (factors of 4–11) with the ML-enhanced scheme, demonstrating the potential to encode subgrid-cloud variability into radiation schemes for next-generation Earth System Models. The study also discusses limitations due to unresolved shallow convection and the absence of aerosols in the high-resolution data, and outlines online deployment and aerosol integration as future directions toward more accurate, generalizable radiation parameterizations with potential computational savings.
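
As a rough illustration of the CRI definition above, the following sketch derives layer heating rates from net fluxes and takes the all-sky minus clear-sky residual. It is not the paper's code: the function name `heating_rate`, the constants, the net-downward flux convention, and the two-layer example column are all assumptions made for illustration. In the hybrid scheme the CRI term would come from the neural network rather than from a flux difference; here it is computed directly so that the recombination step can be checked.

```python
import numpy as np

G = 9.80665      # gravitational acceleration in m s^-2
CP_DRY = 1004.0  # approximate specific heat of dry air in J kg^-1 K^-1 (assumed value)

def heating_rate(net_down_flux, p_half):
    """Layer heating rate in K/s from net downward flux (W m^-2) at half levels.

    Assumes arrays are ordered from the top of the atmosphere downward,
    so pressure (Pa) increases with index along the last axis.
    """
    # Flux entering through the layer top minus flux leaving through the bottom
    flux_absorbed = -np.diff(net_down_flux, axis=-1)
    # Layer pressure thickness, positive for top-down ordering
    layer_dp = np.diff(p_half, axis=-1)
    return G / CP_DRY * flux_absorbed / layer_dp

# Hypothetical two-layer column (values chosen only for illustration)
p_half = np.array([0.0, 50_000.0, 100_000.0])
net_all_sky = np.array([300.0, 280.0, 250.0])
net_clear_sky = np.array([320.0, 305.0, 280.0])

hr_all = heating_rate(net_all_sky, p_half)
hr_clear = heating_rate(net_clear_sky, p_half)

# Cloud radiative impact on heating rates: all-sky minus clear-sky
hr_cri = hr_all - hr_clear

# Hybrid recombination: physics-based clear-sky part plus the cloud part
hr_hybrid = hr_clear + hr_cri
```

Because the decomposition is a plain difference, adding the cloud term back to the clear-sky heating rate recovers the all-sky heating rate exactly; any error in a learned CRI term would carry over one-to-one into the combined heating rate.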

Abstract

Improvements to Machine Learning (ML)-based radiation emulators remain constrained by the underlying assumptions used to represent horizontal and vertical subgrid-scale cloud distributions, which continue to introduce substantial uncertainties. In this study, we introduce a method to represent the impact of subgrid-scale clouds by applying ML to learn processes from high-resolution model output with a horizontal grid spacing of 5 km. In global storm-resolving models, clouds begin to be explicitly resolved. Coarse-graining these high-resolution simulations to the resolution of coarser Earth System Models yields radiative heating rates that implicitly include subgrid-scale cloud effects, without assumptions about their horizontal or vertical distributions. We define the cloud radiative impact as the difference between all-sky and clear-sky radiative fluxes, and train the ML component solely on this cloud-induced contribution to heating rates. The clear-sky tendencies continue to be computed with a conventional physics-based radiation scheme. This hybrid design enhances generalization, since the machine-learned part addresses only subgrid-scale cloud effects, while the clear-sky component remains responsive to changes in greenhouse gas or aerosol concentrations. Applied to coarse-grained data offline, the ML-enhanced radiation scheme reduces errors by a factor of 4-10 compared with a conventional coarse-scale radiation scheme. This shows the potential of representing subgrid-scale cloud effects in radiation schemes with ML for the next generation of Earth System Models.
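
The coarse-graining step described in the abstract can be pictured with a minimal block-averaging sketch. This is only an illustration under simplifying assumptions: it uses a regular rectangular grid and a hypothetical helper `block_mean`, whereas the actual model grid and coarse-graining procedure are not specified here. The second half uses a toy exponential attenuation, not a radiation scheme, to show why averaging fine-scale results differs from computing on averaged inputs.

```python
import numpy as np

def block_mean(field, factor):
    """Average a 2D field over non-overlapping factor x factor blocks."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 4 x 4 fine grid coarse-grained by a factor of 2
# (a factor of 16 would map 5 km cells onto roughly 80 km cells)
fine = np.arange(16.0).reshape(4, 4)
coarse = block_mean(fine, 2)

# Toy nonlinearity: half of a coarse cell is cloud-free, half is optically thick
tau_fine = np.array([[0.0, 0.0], [10.0, 10.0]])
mean_of_transmission = block_mean(np.exp(-tau_fine), 2)[0, 0]
transmission_of_mean = np.exp(-block_mean(tau_fine, 2)[0, 0])
```

In this toy case the averaged fine-scale transmission is about 0.5, while the transmission of the averaged optical depth is below 0.01. The gap is a simple analogue of the subgrid-scale information that coarse-grained heating rates retain and that a coarse-scale calculation must otherwise recover through assumptions about cloud distributions.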

Paper Structure

This paper contains 7 sections, 3 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Sketch of constructing the cloud radiative impact on heating rates. Radiation schemes calculate fluxes for the same scene twice, once with and once without clouds, resulting in all-sky and clear-sky fluxes. The corresponding heating rates are inferred from the fluxes, and their difference yields an approximation of the cloud radiative impact on heating rates for all layers in a column.
  • Figure 2: Distributions of water-related input variables for ICON-A and coarse-grained QUBICC data. The bold line shows the mean and the shaded area shows 95% of the spread, defined as the range between the 2.5th and 97.5th percentiles. The boxplot is limited by the minimum and maximum values. The box edges are defined at the 25th and 75th percentiles of the distribution. The black line marks the mean of the distribution and the star marks the median.
  • Figure 3: The distribution of shortwave (top row) and longwave (bottom row) heating rates in coarse-scale and scaled coarse-grained data. The bold line shows the mean and the shaded area shows 95% of the spread, defined as the range between the 2.5th and 97.5th percentiles. The left column shows all-sky heating rates as used in the ICON model. The middle column shows clear-sky heating rates computed from clear-sky fluxes, which are a diagnostic output of the ICON model. The right column shows the cloud radiative impact on the heating rate, computed by subtracting the clear-sky heating rate from the all-sky heating rate.
  • Figure 4: Comparison of pyRTE and the hybrid ML-based radiation scheme on coarse-grained QUBICC data. Results are shown for the shortwave (top rows) and longwave (bottom rows) spectral ranges, separately for clear-sky samples (no clouds, left column), fully cloudy samples (middle column), and samples with partial cloudiness (right column). The metrics shown are the coefficient of determination $R^2$ (green), bias (orange), and MAE (blue) with 95% of the spread, defined as the range between the 2.5th and 97.5th percentiles. The bias and MAE share the x-axis. The ML clear-sky panels are gray because the clear-sky fluxes are not calculated by the ML model; they are shown for reference.
  • Figure 5: As Figure 4, but for the selected regions shown in Figure 5 of Bock2024.
  • ...and 3 more figures
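
The three evaluation metrics named in the Figure 4 caption ($R^2$, bias, and MAE) can be computed per vertical level along the following lines. This is a minimal sketch, not the paper's evaluation code: the function name `level_metrics`, the array layout (samples by levels), and the example values are assumptions made for illustration.

```python
import numpy as np

def level_metrics(pred, truth):
    """Return (r2, bias, mae) per vertical level.

    pred and truth are arrays of shape (n_samples, n_levels).
    """
    err = pred - truth
    bias = err.mean(axis=0)              # mean signed error per level
    mae = np.abs(err).mean(axis=0)       # mean absolute error per level
    ss_res = (err ** 2).sum(axis=0)
    ss_tot = ((truth - truth.mean(axis=0)) ** 2).sum(axis=0)
    # R^2 is undefined where the reference does not vary; leave NaN there
    r2 = np.full(ss_tot.shape, np.nan)
    valid = ss_tot > 0
    r2[valid] = 1.0 - ss_res[valid] / ss_tot[valid]
    return r2, bias, mae

# Hypothetical heating rates for three samples on two levels
truth = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
pred = truth + 0.5   # a prediction with a constant offset of 0.5

r2, bias, mae = level_metrics(pred, truth)
```

With a constant offset, bias and MAE coincide at every level, while $R^2$ depends on how large the offset is relative to the variance of the reference at that level. That is one reason for reporting all three metrics side by side.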