Diffusion models for probabilistic precipitation generation from atmospheric variables
Michael Aich, Sebastian Bathiany, Philipp Hess, Yu Huang, Niklas Boers
TL;DR
The paper addresses biases and resolution limits in traditional precipitation parameterizations by introducing a two-stage, data-driven framework that learns high-resolution precipitation from large-scale atmospheric variables. It combines a deterministic UNet regression at 1-degree resolution with a conditional diffusion model that generates 0.25-degree ensembles. The model is trained exclusively on ERA5 and is paired with a method for applying it to output from arbitrary Earth system models (ESMs). The approach substantially reduces spatial biases, improves the representation of precipitation statistics and extremes, and preserves large-scale climate trends under future scenarios while enabling fast, probabilistic projections. It thus offers a computationally efficient, model-agnostic tool for downscaling and bias correction that could be integrated into climate modeling workflows.
Abstract
Improving the representation of precipitation in Earth system models (ESMs) is critical for assessing the impacts of climate change, especially of extreme events such as floods and droughts. In existing ESMs, precipitation is not resolved explicitly but is represented by parameterizations. These typically rely on approximate yet computationally expensive column-based physics that does not account for interactions between locations. They struggle to capture fine-scale precipitation processes and introduce significant biases. We present a novel approach, based on generative machine learning, which integrates a conditional diffusion model with a UNet architecture to generate accurate, high-resolution (0.25°) global daily precipitation fields from a small set of prognostic atmospheric variables. Unlike traditional parameterizations, our framework efficiently produces ensemble predictions that capture uncertainties in precipitation, and it does not require manual fine-tuning. We train our model on the ERA5 reanalysis and present a method that allows us to apply it to arbitrary ESM data, enabling fast generation of probabilistic forecasts and climate scenarios. By leveraging interactions between global prognostic variables, our approach provides an alternative parameterization scheme that mitigates biases present in the ESM precipitation while maintaining consistency with its large-scale (annual) trends. This work demonstrates that complex precipitation patterns can be learned directly from large-scale atmospheric variables, offering a computationally efficient alternative to conventional schemes.
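
To make the two-stage design summarized in the TL;DR more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the coarse regression output is upsampled by a factor of four (1-degree to 0.25-degree) and used as the conditioning field for a standard DDPM-style ancestral sampler. The functions coarse_regression and dummy_denoiser are hypothetical stand-ins for the trained UNet and the trained noise-prediction network, and the toy grid size, noise schedule, and step count are illustrative assumptions only.

```python
import numpy as np


def coarse_regression(atmos_vars):
    """Hypothetical stand-in for the deterministic UNet (stage 1).

    Maps atmospheric inputs of shape (channels, lat, lon) to a single
    non-negative coarse field of shape (lat, lon).
    """
    return np.maximum(atmos_vars.mean(axis=0), 0.0)


def upsample(field, factor=4):
    """Nearest-neighbour upsampling, e.g. from a 1-degree to a 0.25-degree grid."""
    return np.repeat(np.repeat(field, factor, axis=0), factor, axis=1)


def dummy_denoiser(x_t, t, cond):
    """Hypothetical stand-in for a trained noise predictor eps(x_t, t, cond)."""
    return x_t - cond


def sample_ensemble(cond, n_members, n_steps=50, seed=0):
    """DDPM-style ancestral sampling (stage 2), one sample per ensemble member.

    Uses a linear beta schedule and sigma_t^2 = beta_t, both assumptions.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, n_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Every member starts from independent Gaussian noise.
    x = rng.standard_normal((n_members,) + cond.shape)
    for t in range(n_steps - 1, -1, -1):
        eps = dummy_denoiser(x, t, cond)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


# Toy example on a small patch: 3 input variables on a 6 x 12 coarse grid.
atmos = np.random.default_rng(1).standard_normal((3, 6, 12))
cond = upsample(coarse_regression(atmos))          # shape (24, 48)
ensemble = sample_ensemble(cond, n_members=5, n_steps=10)

# Ensemble statistics give a per-grid-cell estimate and its spread.
ens_mean = ensemble.mean(axis=0)
ens_spread = ensemble.std(axis=0)
```

Because each member starts from independent noise, the spread across members offers a simple per-grid-cell measure of the uncertainty that the abstract refers to. A deterministic regression alone would return only a single field.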
