Controllable Probabilistic Forecasting with Stochastic Decomposition Layers
John S. Schreck, William E. Chapman, Charlie Becker, David John Gagne, Dhamma Kimpara, Nihanth Cherukuru, Judith Berner, Kirsten J. Mayer, Negin Sobhani
TL;DR
Stochastic Decomposition Layers (SDL) convert deterministic weather models into calibrated probabilistic ensembles using hierarchical, scale-aware perturbations. Applied to WXFormer, the approach combines latent-driven style modulation, channel scaling, and per-pixel noise, and is trained with CRPS. It achieves competitive skill at low computational overhead and allows post-inference spread control through latent rescaling. The method supports reproducible ensemble generation and interpretable, multi-scale uncertainty decomposition, demonstrated on ERA5 data with spread-skill ratios approaching unity. Limitations include dependence on ERA5-derived uncertainty bounds and vertical resolution constraints, which motivates future work on out-of-distribution robustness and integration with physics-based stochastic parameterization.
Abstract
AI weather prediction ensembles that use latent noise injection and are optimized with the continuous ranked probability score (CRPS) have produced both accurate and well-calibrated predictions at far lower computational cost than diffusion-based methods. However, current CRPS ensemble approaches vary in their training strategies and noise injection mechanisms, with most injecting noise globally throughout the network via conditional normalization. This structure increases training expense and limits the physical interpretability of the stochastic perturbations. We introduce Stochastic Decomposition Layers (SDL) for converting deterministic machine learning weather models into probabilistic ensemble systems. Adapted from StyleGAN's hierarchical noise injection, SDL applies learned perturbations at three decoder scales through latent-driven modulation, per-pixel noise, and channel scaling. When applied to WXFormer via transfer learning, SDL requires less than 2% of the computational cost needed to train the baseline model. Each ensemble member is generated from a compact latent tensor (5 MB), enabling perfect reproducibility and post-inference spread adjustment through latent rescaling. Evaluation on 2022 ERA5 reanalysis shows ensembles with spread-skill ratios approaching unity and rank histograms that progressively flatten toward uniformity through medium-range forecasts, achieving calibration competitive with the operational IFS-ENS. Multi-scale experiments reveal hierarchical uncertainty: coarse layers modulate synoptic patterns, while fine layers control mesoscale variability. The explicit latent parameterization provides interpretable uncertainty quantification for operational forecasting and climate applications.
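The two calibration quantities named in the abstract, the CRPS and the spread-skill ratio, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It uses the standard empirical ensemble CRPS estimator (the paper may use a different variant, such as the "fair" estimator) and a simplified spread-skill ratio with no finite-ensemble correction. The function `rescaled_spread_demo` is a hypothetical toy in which members are just scaled Gaussian latents, with no forecast model involved. It only mimics the idea that multiplying stored latents by a factor after inference changes the ensemble spread.

```python
import numpy as np


def ensemble_crps(members, obs):
    """Empirical CRPS per point; members has shape (M, ...), obs has shape (...)."""
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Term 1: mean absolute error of each member against the observation.
    mae = np.abs(members - obs).mean(axis=0)
    # Term 2: mean absolute difference over all M x M member pairs.
    pair = np.abs(members[:, None] - members[None, :]).mean(axis=(0, 1))
    return mae - 0.5 * pair


def spread_skill_ratio(members, obs):
    """Ensemble spread divided by RMSE of the ensemble mean (simplified form)."""
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    spread = np.sqrt(members.var(axis=0, ddof=1).mean())
    rmse = np.sqrt(((members.mean(axis=0) - obs) ** 2).mean())
    return spread / rmse


def rescaled_spread_demo(scales=(0.5, 1.0, 2.0), n_members=16, n_points=5000, seed=0):
    """Toy assumption: each member is scale * latent, and the truth is unit Gaussian."""
    rng = np.random.default_rng(seed)
    obs = rng.standard_normal(n_points)
    # Fixed latent draws: reusing them reproduces the same members exactly.
    latent = rng.standard_normal((n_members, n_points))
    results = {}
    for scale in scales:
        ens = scale * latent  # rescale the latents, no retraining involved
        results[scale] = (
            spread_skill_ratio(ens, obs),
            ensemble_crps(ens, obs).mean(),
        )
    return results


if __name__ == "__main__":
    for scale, (ssr, crps) in rescaled_spread_demo().items():
        print(f"scale={scale:.1f}  spread/skill={ssr:.3f}  mean CRPS={crps:.3f}")
```

In this toy setting the spread-skill ratio grows with the latent scale and is close to one when the member variance matches the variance of the truth, which is the behavior a calibrated ensemble is expected to show.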
