
Swift: An Autoregressive Consistency Model for Efficient Weather Forecasting

Jason Stock, Troy Arcomano, Rao Kotamarthi

TL;DR

Swift is a single-step autoregressive consistency model finetuned with a CRPS objective to produce fast, calibrated probabilistic weather forecasts from medium-range to seasonal scales. It combines TrigFlow-based, diffusion-style pretraining with multi-step autoregressive finetuning, yielding skillful 6-hourly forecasts that remain stable for up to 75 days while running $39\times$ faster than state-of-the-art diffusion baselines. Forecast skill is competitive with the operational IFS ENS, and the paper includes case studies on extreme events and seasonal trends. The result is a practical path toward efficient, reliable ensemble forecasting without multi-model ensembling or parameter perturbations.

Abstract

Diffusion models offer a physically grounded framework for probabilistic weather forecasting, but their typical reliance on slow, iterative solvers during inference makes them impractical for subseasonal-to-seasonal (S2S) applications where long lead times and domain-driven calibration are essential. To address this, we introduce Swift, a single-step consistency model that, for the first time, enables autoregressive finetuning of a probability flow model with a continuous ranked probability score (CRPS) objective. This eliminates the need for multi-model ensembling or parameter perturbations. Results show that Swift produces skillful 6-hourly forecasts that remain stable for up to 75 days, running $39\times$ faster than state-of-the-art diffusion baselines while achieving forecast skill competitive with the numerical, operational IFS ENS. This marks a step toward efficient and reliable ensemble forecasting from medium-range to seasonal scales.
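
As a rough illustration of the CRPS objective mentioned above, the sketch below computes the standard ensemble estimator of the CRPS with NumPy. The abstract does not state the exact form used by Swift, so the function name `ensemble_crps`, the `fair` switch, and the array layout (ensemble members on the leading axis) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def ensemble_crps(members: np.ndarray, target: np.ndarray, fair: bool = True) -> np.ndarray:
    """Per-element ensemble CRPS estimate (hypothetical helper, not the paper's code).

    members: array of shape (M, ...) holding M ensemble forecasts.
    target:  array of shape (...) holding the verifying observation.
    fair:    if True, use the unbiased ("fair") spread normalisation 2*M*(M-1);
             otherwise use 2*M**2.
    """
    m = members.shape[0]
    if fair and m < 2:
        raise ValueError("the fair CRPS estimator needs at least two members")

    # Accuracy term: mean absolute error of each member against the target.
    accuracy = np.abs(members - target[None]).mean(axis=0)

    # Spread term: sum of absolute differences over all ordered member pairs.
    pairwise = np.abs(members[:, None] - members[None, :]).sum(axis=(0, 1))
    denom = 2 * m * (m - 1) if fair else 2 * m * m

    return accuracy - pairwise / denom


# Two members (0 and 2) bracketing an observation of 1.
ens = np.array([[0.0], [2.0]])
obs = np.array([1.0])
fair_score = ensemble_crps(ens, obs)
standard_score = ensemble_crps(ens, obs, fair=False)
```

For this toy case the fair estimate is 0.0 and the standard estimate is 0.5. The accuracy term rewards members that are close to the observation, while the spread term keeps the ensemble from collapsing to a single value. Pairing the two in this way is the general reason CRPS-type losses are used to train calibrated ensembles.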

Paper Structure

This paper contains 22 sections, 7 equations, 15 figures, 1 table.

Figures (15)

  • Figure 1: Overview of our approach. (a) methodological flow diagram of generating a single-member rollout from noise and an updated conditional state; (b) 39$\times$ faster inference over diffusion baselines by using a single function evaluation with comparable skill to the IFS ENS; and (c) output over a subset of variables shown for a single-member forecast at 24h intervals that is 75 days into the future.
  • Figure 2: Our proposed network architecture.
  • Figure 3: Loss weights used during training. (left) pressure weights applied to atmospheric variables; (middle) surface level weighting to specific variables; and (right) clipped latitude weighting.
  • Figure 4: Learning curves for Swift. (left) learning rate schedule with Muon $\eta$ during pretraining and AdamW during finetuning (in red); and (right) training loss with a 3M tangent warmup (Eq. \ref{eq:loss.scm.tangent}, in blue) before multi-step finetuning with $K=1$--$8$ autoregressive steps from 15--20M images.
  • Figure 5: Global forecast skill (on a subset of initials with $\delta i=6$). (a) latitude-weighted ensemble RMSE and spread/skill compared to baselines; and (b) close-up view of the benefit of multi-step finetuning.
  • ...and 10 more figures