Skillful Nowcasting of Convective Clouds With a Cascade Diffusion Model
Haoming Chen, Xiaohui Zhong, Qiang Zhai, Xiaomeng Li, Ying Wa Chan, Pak Wai Chan, Yuanyuan Huang, Hao Li, Xiaoming Shi
TL;DR
This work tackles the challenge of nowcasting convective clouds from satellite data by proposing SATcast, a diffusion-based cascade model conditioned on FuXi-predicted atmospheric fields and historical FY-4A imagery. The two-phase autoregressive cascade (predicting $T+1$ to $T+4$ in phase 1 and $T+5$ to $T+8$ in phase 2, with further extension by reusing outputs) mitigates error accumulation and extends skill to 24 hours, outperforming persistence and optical flow baselines on a large test set. Ablation studies confirm the importance of multimodal conditioning, cascade structure, and fine-tuning, while permutation-importance analyses reveal the evolving contribution of satellite observations and FuXi variables over lead times. The model demonstrates strong generalization across channels and shows promise for operational nowcasting in data-sparse regions, potentially enabling probabilistic forecasts with future extensions.
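The cascade rollout described above can be illustrated with a minimal sketch. It is not the SATcast implementation: `toy_predictor` is a hypothetical stand-in for a conditional diffusion sampler, and the array shapes, the use of only the latest block of predictions as the next context, and the reuse of the phase-2 model for blocks beyond $T+8$ are all assumptions made for illustration.

```python
import numpy as np

FRAMES_PER_PHASE = 4  # T+1..T+4 in phase 1, T+5..T+8 in phase 2 (from the text)


def toy_predictor(history, fuxi_block):
    """Hypothetical stand-in for a conditional diffusion sampler.

    history:    (n_hist, H, W) past or previously predicted frames
    fuxi_block: (FRAMES_PER_PHASE, H, W) conditioning fields for this block
    Returns a (FRAMES_PER_PHASE, H, W) array of predicted frames.
    """
    last = history[-1]
    # Blend the latest frame with each conditioning field (illustrative only).
    return np.stack([0.5 * last + 0.5 * field for field in fuxi_block])


def cascade_rollout(history, fuxi_fields, phase1, phase2, n_frames):
    """Predict n_frames future frames in blocks of FRAMES_PER_PHASE.

    Block 0 uses phase1 on observed history; later blocks use phase2
    (an assumption) on the previous block's outputs.
    """
    n_blocks = -(-n_frames // FRAMES_PER_PHASE)  # ceiling division
    if fuxi_fields.shape[0] < n_blocks * FRAMES_PER_PHASE:
        raise ValueError("not enough conditioning fields for the requested horizon")

    frames = []
    context = history
    for block in range(n_blocks):
        start = block * FRAMES_PER_PHASE
        cond = fuxi_fields[start:start + FRAMES_PER_PHASE]
        model = phase1 if block == 0 else phase2
        pred = model(context, cond)
        frames.extend(pred)
        context = pred  # reuse outputs as the next block's history
    return np.stack(frames[:n_frames])


history = np.zeros((4, 8, 8))   # four past frames on a toy 8x8 grid
fuxi = np.ones((12, 8, 8))      # conditioning fields for 12 future frames
forecast = cascade_rollout(history, fuxi, toy_predictor, toy_predictor, 12)
print(forecast.shape)
```

In this sketch each block is conditioned only on the block before it, so the model is called once per four frames rather than once per frame; whether SATcast conditions later phases on additional context is not specified in the text above.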
Abstract
Accurate nowcasting of convective clouds from satellite imagery is essential for mitigating the impacts of meteorological disasters, especially in developing countries and remote regions with limited ground-based observations. Recent advances in deep learning have shown promise in video prediction; however, existing models frequently produce blurry results and exhibit reduced accuracy when forecasting physical fields. Here, we introduce SATcast, a diffusion model that leverages a cascade architecture and multimodal inputs for nowcasting cloud fields in satellite imagery. SATcast incorporates physical fields predicted by FuXi, a deep-learning weather model, alongside past satellite observations as conditional inputs to generate high-quality future cloud fields. In a comprehensive evaluation, SATcast outperforms conventional methods on multiple metrics, demonstrating superior accuracy and robustness. Ablation studies confirm the importance of its multimodal design and cascade architecture for reliable predictions. Notably, SATcast maintains predictive skill for up to 24 hours, underscoring its potential for operational nowcasting applications.
