Machine learning emulation of precipitation from km-scale regional climate simulations using a diffusion model
Henry Addison, Elizabeth Kendon, Suman Ravuri, Laurence Aitchison, Peter AG Watson
TL;DR
CPMGEM is a diffusion-model emulator that reproduces daily-mean precipitation at 8.8 km resolution from inputs at the 60 km resolution of a GCM, trained on simulations from the UK CPM. It yields realistic spatial structure and a realistic representation of extreme events, while generating samples orders of magnitude faster than running a km-scale CPM. The emulator transfers from CPM-derived training inputs to GCM inputs, captures many aspects of the 21st-century climate change signal (notably in summer), and remains effective even with limited training data. The method enables large-ensemble, high-resolution rainfall projections across multiple GCMs and scenarios, with potential applications in flood risk, adaptation planning, and uncertainty quantification.
Abstract
High-resolution climate simulations are valuable for understanding climate change impacts. This has motivated the use of regional convection-permitting climate models (CPMs), but these are very computationally expensive. We present a convection-permitting model generative emulator (CPMGEM) that skilfully emulates precipitation simulations by a 2.2 km-resolution regional CPM at much lower cost. It uses a generative machine learning approach, a diffusion model. It takes inputs at the 60 km resolution of the driving global climate model (GCM) and downscales these to 8.8 km, with daily-mean time resolution, capturing the effect of convective processes represented in the CPM at these scales. The emulator is trained on simulations over England and Wales from the United Kingdom Climate Projections Local product, covering years between 1980 and 2080 following a high emissions scenario. The output precipitation has a spatial structure and intensity distribution similarly realistic to those of the CPM simulations. The emulator is stochastic, which improves the realism of samples. We show evidence that the emulator has skill for extreme events with return times of ~100 years. It captures the main features of the simulated 21st-century climate change, but with some error in the magnitude. We demonstrate successful transfer from a "perfect model" training setting to application using GCM variable inputs. We also show that the method can be useful in situations with limited amounts of high-resolution data. Potential applications include producing high-resolution precipitation predictions for large-ensemble climate simulations and producing output based on different GCMs and climate change scenarios to better sample uncertainty.
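The 8.8 km target grid is four times coarser than the 2.2 km CPM grid along each axis. As a minimal sketch of how a fine-grid precipitation field could be brought to such a target grid, the Python below block-averages a 2D array by an integer factor. The abstract does not state how the CPM output is regridded, so block-averaging is an assumption made here for illustration, and the function name `coarsen_mean` and the toy field are hypothetical.

```python
import numpy as np


def coarsen_mean(field: np.ndarray, factor: int = 4) -> np.ndarray:
    """Block-average a 2D field by an integer factor along both axes.

    Assumption: simple, unweighted averaging of factor x factor blocks,
    which is not necessarily the regridding used for CPMGEM.
    """
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    # Split each axis into (number of blocks, cells per block), then
    # average over the two within-block axes.
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))


# Hypothetical 8x8 daily-mean precipitation field on a 2.2 km grid (mm/day).
fine = np.arange(64, dtype=float).reshape(8, 8)

# A factor of 4 takes the 2.2 km grid to 8.8 km, giving a 2x2 field here.
coarse = coarsen_mean(fine, factor=4)
```

Because every block contains the same number of cells, unweighted block-averaging preserves the domain-mean precipitation, which makes it a convenient sanity check when preparing coarse-grid targets.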
