Physically Constrained Generative Adversarial Networks for Improving Precipitation Fields from Earth System Models

Philipp Hess, Markus Drüke, Stefan Petri, Felix M. Strnad, Niklas Boers

TL;DR

The paper addresses the challenge of biases in precipitation from low‑resolution Earth system models by introducing a physically constrained CycleGAN that operates on unpaired data to jointly improve temporal distributions and spatial structure while preserving the global precipitation sum. It demonstrates superior correction of temporal biases, notably removing the double ITCZ, and reproduces realistic spatial intermittency and high‑frequency structure, outperforming quantile mapping and rivaling CMIP6 outputs in key metrics. The global-sum constraint enables generalization to non‑stationary, future climate states (e.g., SSP5‑8.5), and interpretability via SmoothGrad identifies geographically coherent bias regions, particularly in the tropical Pacific. The approach offers a computationally efficient route to realistic precipitation fields, enabling large ensemble studies and integration with other Earth system components at a fraction of the cost of full high‑resolution models.

Abstract

Precipitation results from complex processes across many scales, making its accurate simulation in Earth system models (ESMs) challenging. Existing post-processing methods can improve ESM simulations locally, but cannot correct errors in modelled spatial patterns. Here we propose a framework based on physically constrained generative adversarial networks (GANs) to improve local distributions and spatial structure simultaneously. We apply our approach to the computationally efficient ESM CM2Mc-LPJmL. Our method outperforms existing ones in correcting local distributions, and leads to strongly improved spatial patterns especially regarding the intermittency of daily precipitation. Notably, a double-peaked Intertropical Convergence Zone, a common problem in ESMs, is removed. Enforcing a physical constraint to preserve global precipitation sums, the GAN can generalize to future climate scenarios unseen during training. Feature attribution shows that the GAN identifies regions where the ESM exhibits strong biases. Our method constitutes a general framework for correcting ESM variables and enables realistic simulations at a fraction of the computational costs.
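
The global-sum constraint mentioned in the abstract can be illustrated with a minimal sketch. It assumes the constraint is enforced by rescaling the generator output so that its sum matches the sum of the ESM input at each time step; the function name `preserve_global_sum`, the grid shape and the random test fields are hypothetical, and grid-cell area weighting is ignored for brevity.

```python
import numpy as np

def preserve_global_sum(generated, esm_input, eps=1e-12):
    """Rescale `generated` so that its total matches the total of `esm_input`.

    Assumption: a single multiplicative factor per field; area weights are omitted.
    """
    generated = np.asarray(generated, dtype=float)
    esm_input = np.asarray(esm_input, dtype=float)
    current = generated.sum()
    if current < eps:
        # An all-zero field cannot be rescaled; return it unchanged.
        return generated.copy()
    return generated * (esm_input.sum() / current)

# Hypothetical daily precipitation fields on an illustrative lat-lon grid.
rng = np.random.default_rng(0)
esm_field = rng.gamma(shape=0.5, scale=4.0, size=(60, 96))    # stand-in for ESM output
raw_output = rng.gamma(shape=0.3, scale=6.0, size=(60, 96))   # stand-in for unconstrained GAN output

constrained = preserve_global_sum(raw_output, esm_field)
print(esm_field.sum(), raw_output.sum(), constrained.sum())
```

Because the rescaling is multiplicative, the spatial pattern produced by the generator is kept and non-negative values stay non-negative; only the overall amount is tied to the ESM input, which is what would allow a trend in the input to carry through to the output.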

Paper Structure (7 sections, 13 equations, 5 figures, 1 table)

This paper contains 7 sections, 13 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Schematic of the CycleGAN model, showing the two generator-discriminator pairs that learn to translate samples from the ESM simulations to the ERA5 reanalysis (grey) and vice versa (yellow). Training the two generators to learn inverse mappings of each other makes it possible to enforce cycle-consistency in the translation of the unpaired samples, i.e. $x \rightarrow G(x) \rightarrow F(G(x)) \rightarrow \tilde{x} \approx x$ and vice versa for $y$. As described by Zhu et al. (2017), the cycle-consistency loss (Eq. \ref{eq:cycle_loss}) is motivated by natural language translation, where one should arrive at the same sentence after translating it into another language and back. During training, this has been found to improve stability and to prevent typical problems of adversarial networks, such as mode collapse, where every input is mapped to the same output image (Zhu et al., 2017).
  • Figure 2: Comparison of global mean error maps over the JJA season, long-term precipitation statistics based on latitude profiles, and relative frequency histograms. Mean errors of (a) CM2Mc-LPJmL, (b) GFDL-ESM4, (c) QM-based and (d) GAN-based post-processing methods applied to the CM2Mc-LPJmL output. The mean error is computed with respect to the ERA5 reanalysis data. The largest errors occur in the tropics, where the largest mean precipitation values are also observed (see panel (e)). The GAN shows the largest error reduction, strongly reducing the double-peaked ITCZ in the tropics. Quantile mapping, on the other hand, is not able to remove the ITCZ bias. See Figs. S1--S4 for corresponding figures for annual time series, as well as the other three seasons. (e) Precipitation rates averaged over time and longitude and (f) relative frequency histograms are shown for ERA5 data (black), CM2Mc-LPJmL (red), GFDL-ESM4 (blue), quantile mapping (magenta) and the GAN (cyan). The GAN applied to the CM2Mc-LPJmL output corrects the double-peaked ITCZ as well as the histogram over the entire range of precipitation rates.
  • Figure 3: Qualitative and quantitative comparison of the intermittency in daily precipitation above 1 mm/day, on the same date (25th December 2014), for the (a) ERA5 reanalysis, (b) CM2Mc-LPJmL model, (c) GAN-based and (d) QM-based post-processing. The CM2Mc-LPJmL precipitation field (b) is an input to the GAN generator, which transforms it into the field shown in panel (c). The discriminator network then classifies whether the GAN output (c) or the ERA5 field (a) was generated artificially. Visually, the GAN substantially improves the spatial intermittency towards that seen in ERA5, whereas applying QM does not lead to improved intermittency. Note that the modelled fields are not expected to be point-wise similar to the ERA5 'ground truth' (a), since these are time slices from climate projection runs. (e) The spatial power spectral density (PSD) of the different precipitation fields, averaged radially in space and over time, for ERA5 reanalysis (black), CM2Mc-LPJmL (red), GFDL-ESM4 (blue), quantile mapping (magenta) and the GAN (cyan). Note that only GAN-based post-processing of the CM2Mc-LPJmL model yields an accurate PSD across all spatial scales.
  • Figure 4: Large-scale trends as a three-year rolling mean of monthly and spatially averaged precipitation for the CMIP6 SSP5-8.5 scenario, for (a) global data, (b) the tropics and (c) the temperate zone, of the CM2Mc-LPJmL (red crosses) and GFDL-ESM4 (blue) models, as well as the constrained (cyan) and unconstrained (brown) GANs. Only adding the physical constraint to preserve the global precipitation amount per time step enables the GAN (cyan) to follow the transient dynamics of the non-stationary climate scenario.
  • Figure 5: Annual average of daily precipitation fields from CM2Mc-LPJmL (color shading with scale according to the colorbar on the left) together with attribution maps (contour lines with color scale according to the colorbar on the right). Note that we applied a Gaussian filter to the attribution maps to further reduce the noise. A standard deviation $\sigma=1.5$ for the filter was found to give robust results. The Pacific region in the tropics shows the highest annual mean precipitation, and also the highest feature importance. The same region also exhibits the largest bias of CM2Mc-LPJmL, see Fig. \ref{fig:temporal_bias}. Note that the double-ITCZ bias in particular is a common and long-standing problem in the precipitation output of many general circulation models (Tian and Dong, 2020).
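
The cycle-consistency idea described in the caption of Figure 1 can be made concrete with a small numerical sketch. It assumes the L1 form of the loss from the original CycleGAN formulation, i.e. the mean absolute difference between $x$ and $F(G(x))$ plus the same term for $y$ and $G(F(y))$; the toy generators below are simple hypothetical functions rather than neural networks, and the paper's exact loss weighting is not reproduced.

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle loss: mean |F(G(x)) - x| + mean |G(F(y)) - y|."""
    x_cycled = F(G(x))  # ESM -> "reanalysis-like" -> back to ESM
    y_cycled = G(F(y))  # reanalysis -> "ESM-like" -> back to reanalysis
    return np.mean(np.abs(x_cycled - x)) + np.mean(np.abs(y_cycled - y))

# Hypothetical toy generators (stand-ins for the two generator networks).
def G(x):
    return 2.0 * x + 1.0

def F_exact(y):
    # Exact inverse of G, so both cycles return to their starting point.
    return (y - 1.0) / 2.0

def F_biased(y):
    # Not an inverse of G: the cycles end up offset from the input.
    return y / 2.0

rng = np.random.default_rng(0)
x = rng.random((4, 8, 8))  # batch of hypothetical ESM fields
y = rng.random((4, 8, 8))  # batch of hypothetical reanalysis fields

loss_exact = cycle_consistency_loss(x, y, G, F_exact)
loss_biased = cycle_consistency_loss(x, y, G, F_biased)
print(loss_exact, loss_biased)
```

When the two mappings are inverses of each other the loss vanishes up to floating-point error, whereas the mismatched pair is penalised; during training this term is added to the adversarial losses so that the generators cannot ignore their inputs.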