
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling

Junchao Gong, Lei Bai, Peng Ye, Wanghan Xu, Na Liu, Jianhua Dai, Xiaokang Yang, Wanli Ouyang

TL;DR

CasCast addresses the challenge of skillful, high-resolution precipitation nowcasting by decoupling the prediction of mesoscale, deterministic precipitation motion from the generation of small-scale, stochastic patterns. It advances a cascaded approach where a deterministic component predicts the global distribution and a diffusion-based probabilistic component, operating in a latent space with a frame-wise guided diffusion transformer (CasFormer), generates high-resolution details conditioned on past context. This combination yields strong performance across three radar datasets, with notable gains for regional extreme precipitation and reduced computational cost relative to full high-resolution diffusion. The work demonstrates the value of separating scale-specific dynamics and using frame-wise latent-space diffusion to deliver reliable, high-resolution nowcasts with practical applicability in disaster management and urban planning.

Abstract

Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management. Although deep learning has brought progress, two key challenges in precipitation nowcasting remain unsolved: (i) modeling the evolution of complex precipitation systems across different scales, and (ii) forecasting extreme precipitation accurately. In this work, we propose CasCast, a cascaded framework composed of a deterministic part and a probabilistic part, which decouples the prediction of mesoscale precipitation distributions from that of small-scale patterns. We then explore training the cascaded framework at high resolution and conducting the probabilistic modeling in a low-dimensional latent space with a frame-wise-guided diffusion transformer, which improves the optimization of extreme events while reducing computational cost. Extensive experiments on three benchmark radar precipitation datasets show that CasCast achieves competitive performance. In particular, CasCast significantly surpasses the baseline (up to +91.8%) for regional extreme-precipitation nowcasting.
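
The data flow of the cascade described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under strong assumptions, not the authors' implementation: every function name here (`deterministic_forecast`, `encode`, `decode`, `toy_denoiser`, `cascade_nowcast`) is hypothetical, the deterministic part is replaced by a mean-of-context forecast, the frame-wise autoencoder by average pooling and nearest-neighbour upsampling, and the diffusion transformer by a placeholder update that simply pulls the latent toward the coarse forecast.

```python
import numpy as np

def deterministic_forecast(context, n_future):
    """Toy stand-in for the deterministic part: repeat the mean context frame."""
    mean_frame = context.mean(axis=0, keepdims=True)       # (1, H, W)
    return np.repeat(mean_frame, n_future, axis=0)          # (n_future, H, W)

def encode(frames, factor=4):
    """Toy frame-wise encoder: average-pool every frame by `factor`."""
    t, h, w = frames.shape
    pooled = frames.reshape(t, h // factor, factor, w // factor, factor)
    return pooled.mean(axis=(2, 4))                          # (t, H/factor, W/factor)

def decode(latents, factor=4):
    """Toy frame-wise decoder: nearest-neighbour upsampling back to pixel space."""
    return latents.repeat(factor, axis=1).repeat(factor, axis=2)

def toy_denoiser(z, cond_context, cond_coarse, k):
    """Placeholder for the conditional denoiser (hypothetical).

    A trained network would use both conditions to add small-scale detail.
    This stand-in ignores cond_context and only moves z toward cond_coarse,
    reaching it exactly at the last step (k == 1).
    """
    return z + (cond_coarse - z) / k

def cascade_nowcast(context, n_future, n_steps=10, seed=0):
    """Deterministic forecast in pixel space, then iterative refinement in latent space."""
    coarse = deterministic_forecast(context, n_future)       # y' in pixel space
    e_x, e_y = encode(context), encode(coarse)               # E(x), E(y')
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(e_y.shape)                       # start from Gaussian noise
    for k in range(n_steps, 0, -1):                          # z_k -> z_{k-1}
        z = toy_denoiser(z, e_x, e_y, k)
    return decode(z)                                         # back to pixel space

if __name__ == "__main__":
    past = np.random.default_rng(1).random((4, 16, 16))      # 4 context frames
    print(cascade_nowcast(past, n_future=3).shape)
```

The point of the sketch is the ordering of the stages: the expensive iterative loop runs only on the pooled latents, which is where the reduced cost relative to pixel-space diffusion comes from.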

Paper Structure

This paper contains 25 sections, 8 equations, 16 figures, 4 tables.

Figures (16)

  • Figure 1: Different precipitation nowcasting pipelines. Given context $x$, the predictions of deterministic and probabilistic models are $y^{\prime}$ and $y^{\prime\prime}$, respectively. Our CasCast generates $y^{\prime\prime}$ conditioned on $x$ and $y^{\prime}$ in the latent space.
  • Figure 2: Prediction visualization of different methods at a lead time of 60 minutes. The prediction of EarthFormer (Gao et al., 2022) lacks small-scale patterns. The prediction of PreDiff (Gao et al., 2023) has a lower regional extreme value. In contrast, our CasCast method effectively captures both local patterns and regional extreme values.
  • Figure 3: Left: Overview of our CasCast. First, CasCast employs a deterministic model in pixel space to generate the blurry prediction $y^{\prime}_{T:T^{\prime}}$ from previous observations $x_{0:T}$. Then, $x_{0:T}$ and $y^{\prime}_{T:T^{\prime}}$ are encoded into latent representations $\mathrm{E}(x_{0:T})$ and $\mathrm{E}(y^{\prime}_{T:T^{\prime}})$ by a pretrained frame-wise encoder $\mathrm{E}$. Finally, conditioned on $\mathrm{E}(x_{0:T})$ and $\mathrm{E}(y^{\prime}_{T:T^{\prime}})$, the final prediction is generated by the diffusion denoising process with a novel CasFormer and decoded back to pixel space by a pretrained decoder. Right: Illustration of our CasFormer. First, $\mathrm{E}(y^{\prime}_{T:T^{\prime}})$ and the latent vector $z_{k}$ are split into frame-wise inputs and processed by patch embedding and $\frac{L}{2}$ layers of diffusion attention blocks. Then, the frame-wise features $h^{0}_{f}, \dots, h^{T^{\prime}-T}_{f}$ and $\mathrm{E}(x_{0:T})$ are combined into the sequence-wise feature $h_s$ by a sequence aggregator, which is used to predict the latent vector $z_{k-1}$.
  • Figure 4: HSS and CSI scores of autoencoders with different latent dimensions. The values $16, 74, 133, 160, 181, 219$ are the thresholds used to compute the scores.
  • Figure 5: A set of example forecasts. From top to bottom: Target, ConvLSTM, SimVP, LDM, PreDiff, and CasCast (EarthFormer). From left to right: forecasts at different lead times. (More qualitative results are shown in the Appendix.)
  • ...and 11 more figures
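
The CSI and HSS scores named in the Figure 4 caption are threshold-based categorical scores, and a minimal NumPy sketch of how they are commonly computed may help. It uses the standard contingency-table definitions, CSI = TP / (TP + FP + FN) and HSS = 2(TP·TN − FN·FP) / ((TP + FN)(FN + TN) + (TP + FP)(FP + TN)). The function names are hypothetical, and treating a pixel as an event when its value is greater than or equal to the threshold is an assumption; the paper's evaluation code may binarize or aggregate differently.

```python
import numpy as np

# Thresholds listed in the Figure 4 caption.
THRESHOLDS = (16, 74, 133, 160, 181, 219)

def contingency(pred, target, threshold):
    """Count hits, false alarms, misses and correct negatives at one threshold."""
    p = np.asarray(pred) >= threshold      # assumption: event means value >= threshold
    t = np.asarray(target) >= threshold
    tp = int(np.sum(p & t))                # hits
    fp = int(np.sum(p & ~t))               # false alarms
    fn = int(np.sum(~p & t))               # misses
    tn = int(np.sum(~p & ~t))              # correct negatives
    return tp, fp, fn, tn

def csi(pred, target, threshold):
    """Critical Success Index: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = contingency(pred, target, threshold)
    denom = tp + fp + fn
    return tp / denom if denom else float("nan")

def hss(pred, target, threshold):
    """Heidke Skill Score from the 2x2 contingency table."""
    tp, fp, fn, tn = contingency(pred, target, threshold)
    denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return 2 * (tp * tn - fn * fp) / denom if denom else float("nan")

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.integers(0, 256, size=(8, 8))
    pred = np.clip(target + rng.integers(-30, 31, size=(8, 8)), 0, 255)
    for th in THRESHOLDS:
        print(th, csi(pred, target, th), hss(pred, target, th))
```

Both functions return NaN when the denominator is zero, for example when neither the forecast nor the target exceeds the threshold anywhere; how such cases are handled in averaged scores is a design choice the caption does not specify.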