Precipitation Downscaling with Spatiotemporal Video Diffusion
Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt
TL;DR
This work tackles high-resolution precipitation downscaling by modeling the full conditional distribution of fine-scale precipitation given coarse-grid inputs. It introduces SpatioTemporal Video Diffusion (STVD), a two-stage framework: a spatio-temporal UNet first produces a deterministic downscaled estimate, and a diffusion model conditioned on both the low-resolution sequence and that estimate then adds stochastic, multimodal fine-scale detail. The approach outperforms six strong baselines across multiple metrics (MSE, CRPS, EMD, PE, SAE) on FV3GFS-derived data, and ablations demonstrate the importance of temporal context and additional climate inputs. By preserving extreme-event statistics and fine-scale spatial structure, STVD offers a practical, probabilistic path to downscaling that can support climate risk assessment and regional planning under limited computational budgets.
Abstract
In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround in which a low-resolution prediction is refined using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require accurately capturing the conditional distribution of high-resolution patterns given low-resolution ones, to ensure reliable ensemble averages and unbiased estimates of extreme events such as heavy rain. This work extends recent video diffusion models to precipitation super-resolution, employing a deterministic downscaler followed by a temporally conditioned diffusion model to capture noise characteristics and high-frequency patterns. We test our approach on output from FV3GFS, an established large-scale global atmosphere model, and compare it against six state-of-the-art baselines. Our analysis, covering CRPS, MSE, precipitation distributions, and qualitative aspects using California and the Himalayas as examples, establishes our method as a new standard for data-driven precipitation downscaling.
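The abstract names CRPS (continuous ranked probability score) as an evaluation metric but does not say how it is computed. As a minimal sketch, assuming the common ensemble estimator (mean absolute error of the samples against the observation, minus half the mean absolute difference between sample pairs), the score for a single grid cell could be computed as follows. The function name `ensemble_crps` and the per-cell, pure-Python formulation are illustrative assumptions, not the authors' implementation.

```python
from itertools import product


def ensemble_crps(samples, observation):
    """Empirical CRPS for one scalar observation and a list of ensemble samples.

    Assumed estimator (not taken from the paper):
        CRPS ~ mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|
    Lower is better; with a single sample it reduces to the absolute error.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one ensemble member")
    # Accuracy term: mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in samples) / m
    # Spread term: mean absolute difference over all ordered pairs of members.
    spread = sum(abs(a - b) for a, b in product(samples, repeat=2)) / (m * m)
    return accuracy - 0.5 * spread


if __name__ == "__main__":
    # Hypothetical precipitation samples (mm/day) for one high-resolution cell.
    generated = [1.0, 3.0]
    observed = 2.0
    print(ensemble_crps(generated, observed))
```

Under this estimator, an ensemble that brackets the observation scores better than a single deterministic guess with the same mean error, which is why the metric suits probabilistic downscalers that draw several samples per low-resolution input.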
