Kilometer-Scale Convection Allowing Model Emulation using Generative Diffusion Modeling

Jaideep Pathak, Yair Cohen, Piyush Garg, Peter Harrington, Noah Brenowitz, Dale Durran, Morteza Mardani, Arash Vahdat, Shaoming Xu, Karthik Kashinath, Michael Pritchard

TL;DR

This work introduces StormCast, a first-of-its-kind generative diffusion-based emulator for km-scale convection-allowing models (CAMs) that forecasts high-resolution atmospheric states with 1-hour steps, conditioned on synoptic fields. It learns the conditional distribution $p_{\theta}(M_{t+1} \mid S_t, M_t)$ through a two-phase process combining deterministic regression and stochastic diffusion, enabling autoregressive sampling and ensemble generation. Across case studies and ensemble experiments, StormCast shows competitive skill with HRRR for radar reflectivity up to about 6 hours, preserves physically consistent multivariate convective dynamics, and produces realistic power spectra and distributions. The results suggest that autoregressive generative ML can realistically emulate CAM dynamics, offering scalable ensembles and potential applications in regional weather prediction and climate downscaling, while highlighting areas for improvement in calibration and data requirements.
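The two-phase sampling loop can be sketched in a few lines. This is a toy illustration, not the authors' implementation: `regression_mean` and `sample_residual` are hypothetical stand-ins for the trained regression network and the diffusion sampler, and the synoptic and mesoscale states are given the same shape only for simplicity. What the sketch does take from the paper is the structure: a deterministic mean $\mu_{t+1}$, a sampled residual $r_{t+1}$, their sum as $M_{t+1}$, and the autoregressive reuse of $M_{t+1}$ at the next step.

```python
import numpy as np


def regression_mean(S_t, M_t):
    # Hypothetical stand-in for the deterministic regression network:
    # damped persistence nudged toward the synoptic state.
    return 0.9 * M_t + 0.1 * S_t


def sample_residual(mu_next, M_t, rng, scale=0.05):
    # Hypothetical stand-in for the diffusion sampler: plain Gaussian noise.
    # A real sampler would denoise iteratively, conditioned on mu_next and M_t.
    return scale * rng.standard_normal(mu_next.shape)


def rollout(M0, S_seq, rng):
    """Autoregressive forecast: one new mesoscale state per synoptic input."""
    M_t = M0
    states = []
    for S_t in S_seq:
        mu_next = regression_mean(S_t, M_t)          # phase 1: mean forecast
        r_next = sample_residual(mu_next, M_t, rng)  # phase 2: sampled residual
        M_t = mu_next + r_next                       # M_{t+1} = mu_{t+1} + r_{t+1}
        states.append(M_t)
    return np.stack(states)


def ensemble(M0, S_seq, n_members, seed=0):
    """Each member reruns the rollout with its own random stream."""
    return np.stack(
        [rollout(M0, S_seq, np.random.default_rng(seed + i)) for i in range(n_members)]
    )
```

Calling `ensemble(M0, S_seq, n_members=5)` with `S_seq` of length 12 returns an array of shape `(5, 12, *M0.shape)`. The members differ only through the sampled residuals, which is why ensembles cost no more than repeated sampling.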

Abstract

Storm-scale convection-allowing models (CAMs) are an important tool for predicting the evolution of thunderstorms and mesoscale convective systems that result in damaging extreme weather. By explicitly resolving convective dynamics within the atmosphere, they afford meteorologists the nuance needed to provide outlooks on hazards. Deep learning models have thus far not proven skilful at km-scale atmospheric simulation, despite being competitive at coarser resolution with state-of-the-art global, medium-range weather forecasting. We present a generative diffusion model called StormCast, which emulates the high-resolution rapid refresh (HRRR) model, NOAA's state-of-the-art 3 km operational CAM. StormCast autoregressively predicts 99 state variables at km scale using a 1-hour time step, with dense vertical resolution in the atmospheric boundary layer, conditioned on 26 synoptic variables. We present evidence of successfully learnt km-scale dynamics including competitive 1-6 hour forecast skill for composite radar reflectivity alongside physically realistic convective cluster evolution, moist updrafts, and cold pool morphology. StormCast predictions maintain realistic power spectra for multiple predicted variables across multi-hour forecasts. Together, these results establish the potential for autoregressive ML to emulate CAMs -- opening up new km-scale frontiers for regional ML weather prediction and future climate hazard dynamical downscaling.

Paper Structure (15 sections, 6 equations, 18 figures, 1 table)


Figures (18)

  • Figure 1: (a) StormCast generates an autoregressive forecast starting from an initial condition generated by HRRR analysis and using an hourly synoptic-scale forecast produced by the GFS model. (b) Illustration of how StormCast generates a km-scale forecast in a two-step process. The synoptic-scale fields $S_t$ and the mesoscale fields $M_t$ at time $t$ are used to generate a one-hour-ahead deterministic mean forecast $\mu_{t+1}$ using a neural network $F_\theta$ with a UNet architecture. The mean forecast $\mu_{t+1}$ and the mesoscale fields $M_t$ are concatenated with a latent random Gaussian noise vector and passed through a denoising diffusion model $D_\phi$ for a series of diffusion steps to sample an estimate of the residual forecast $r_{t+1}$, which is added to the mean $\mu_{t+1}$ to generate $M_{t+1}$ from the forecast distribution at time $t+1$. This process is repeated autoregressively to generate a 12-hour forecast. The synoptic-scale conditioning $S_t$ at each time step $t$ is provided to the model via a 25 km global model -- the NCEP GFS model in forecasting mode and ERA5 reanalysis in our hindcast tests. Panel (c) illustrates the stacked channels representing the synoptic-scale state $S_t$ on a pressure-level vertical grid and interpolated to the km-scale domain, as well as the mesoscale state $M_t$ on the native HRRR model hybrid vertical grid. Refer to Tab. \ref{tab:parameters} for the full channel set. The spatial extent of the domain of operation of the StormCast model is illustrated in panel (d) with a blue bounding box. The domain size is $1536$ km $\times$ $1920$ km.
  • Figure 2: An example forecast of composite radar reflectivity generated by StormCast and HRRR compared with observed verification data from the Multi-Radar Multi-Sensor (MRMS) network. From left to right, the columns show the HRRR forecast, the corresponding MRMS observation, the Probability Matched Mean (PMM) of a five-member StormCast ensemble forecast and one of the StormCast ensemble members. The HRRR and StormCast forecasts were initialized at 2024-05-29 12:00 UTC. The StormCast forecasts were initialized using the HRRR analysis at the initialization time. The rows from top to bottom show the forecast at progressively longer lead times (1 hour, 3 hours, 6 hours and 12 hours) along with the corresponding MRMS observation at the appropriate time.
  • Figure 3: The Fractions Skill Score (FSS) of StormCast forecasts is compared with the corresponding FSS of HRRR forecasts. We compute the FSS at a few different pooling window sizes to illustrate the forecast skill at various spatial scales.
  • Figure 4: Forecast skill of a few selected variables predicted by the StormCast model compared with the HRRR model. Forecast skill is measured using the Root Mean Square Error (RMSE) between forecasts and the verification data for composite reflectivity, 10 m wind velocity components, 2 m temperature, as well as the winds, temperature and specific humidity at HRRR native levels 5 and 10. The verification data for all variables except the composite reflectivity is the HRRR analysis at the verification time. For composite reflectivity, the verification data is the corresponding observed reflectivity from the MRMS sensor network at the verification time. The forecast RMSE scores are averaged over 130 forecasts from May 10 2024 to June 15 2024 with forecasts generated four times daily at 00, 06, 12, 18 UTC. We show the skill of the single-member HRRR forecast (solid lines), a single-member StormCast forecast (dotted lines), and the Probability Matched Mean (PMM) of a 5-member ensemble from StormCast (dashed lines).
  • Figure 5: Power spectra of selected variables. The left column shows the power spectra at a lead time of 3 hours, comparing the diffusion model (blue) and the regression model (orange) with the target. Each row in the right column shows the nondimensional relative difference from the target (model/target $-$ 1) for the variable in the left column, for a selection of lead times. The top row also includes the spectra from the HRRR forecast for the corresponding times.
  • ...and 13 more figures
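
The Fractions Skill Score in Figure 3 can be illustrated with a minimal sketch. It assumes the standard neighborhood definition: both fields are thresholded, the fraction of exceeding pixels is computed in each sliding window, and the score is one minus the mean squared difference of the fractions divided by the sum of their mean squares. The threshold value, the sliding "valid" windows, and the function names are assumptions for illustration. The captions do not state the exact pooling or thresholds used in the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def neighborhood_fraction(field, threshold, window):
    """Fraction of pixels at or above `threshold` in each window x window patch."""
    binary = (field >= threshold).astype(float)
    patches = sliding_window_view(binary, (window, window))
    return patches.mean(axis=(-2, -1))


def fractions_skill_score(forecast, observed, threshold, window):
    """FSS = 1 - MSE(Pf, Po) / (mean(Pf^2) + mean(Po^2)); NaN if both fields are empty."""
    pf = neighborhood_fraction(forecast, threshold, window)
    po = neighborhood_fraction(observed, threshold, window)
    mse = np.mean((pf - po) ** 2)
    reference = np.mean(pf ** 2) + np.mean(po ** 2)
    if reference == 0.0:
        return float("nan")  # no exceedances anywhere: score undefined
    return float(1.0 - mse / reference)
```

A forecast that places a storm one pixel away from where it was observed scores 0 with a window of 1 and a higher value with a window of 3. This is why the score is reported at several window sizes: larger windows forgive small displacement errors and so measure skill at coarser spatial scales.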