AIFS-CRPS: Ensemble forecasting using a model trained with a loss function based on the Continuous Ranked Probability Score

Simon Lang, Mihai Alexe, Mariana C. A. Clare, Christopher Roberts, Rilwan Adewoyin, Zied Ben Bouallègue, Matthew Chantry, Jesper Dramsch, Peter D. Dueben, Sara Hahner, Pedro Maciel, Ana Prieto-Nemesio, Cathal O'Brien, Florian Pinault, Jan Polster, Baudouin Raoult, Steffen Tietsche, Martin Leutbecher

TL;DR

ECMWF introduces AIFS-CRPS, a probabilistic machine-learned ensemble forecast model trained with the almost fair CRPS (afCRPS) loss to generate exchangeable stochastic members. The approach combines a transformer-based encoder–processor–decoder with a four-stage training regime and rollout mitigation via reference-field downsampling, delivering realistic variability and improved skill for medium-range and subseasonal forecasts. Across O96 and N320 grid configurations, AIFS-CRPS generally outperforms the 9 km IFS ensemble for many upper-air and several surface variables, while maintaining cheap inference and scalable ensemble generation. Remaining challenges include stratospheric performance tied to loss scaling and initial-condition perturbations, with planned work to address these and extend observation-based training. The results suggest a promising path to real-time, probabilistic forecasts that leverage machine learning for calibrated uncertainty estimates.

Abstract

Over the last three decades, ensemble forecasts have become an integral part of weather forecasting. They give users more complete information than single forecasts because they make it possible to estimate the probability of weather events by representing the sources of uncertainty and accounting for the day-to-day variability of error growth in the atmosphere. This paper presents a novel approach to obtaining a weather forecast model for ensemble forecasting with machine learning. AIFS-CRPS is a variant of the Artificial Intelligence Forecasting System (AIFS) developed at ECMWF. Its loss function is based on a proper score, the Continuous Ranked Probability Score (CRPS). For the loss, the almost fair CRPS is introduced because it approximately removes the bias in the score due to finite ensemble size yet avoids a degeneracy of the fair CRPS. The trained model is stochastic and, at inference time, can generate as many exchangeable members as desired and computationally feasible. For medium-range forecasts, AIFS-CRPS outperforms the physics-based Integrated Forecasting System (IFS) ensemble for the majority of variables and lead times. For subseasonal forecasts, AIFS-CRPS outperforms the IFS ensemble before calibration and is competitive with the IFS ensemble when forecasts are evaluated as anomalies to remove the influence of model biases.
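The relationship between the three scores named in the abstract can be illustrated with a minimal sketch for a single scalar forecast variable. The standard finite-ensemble CRPS estimator and the fair CRPS differ only in how the member-to-member spread term is normalised. The text above does not state the exact form of the almost fair CRPS, so the convex blend below, the weight `alpha`, and all function names are assumptions made for illustration rather than the paper's definition.

```python
from itertools import combinations


def _terms(members, obs):
    """Return the mean absolute error and the sum of |x_j - x_k| over unordered pairs."""
    m = len(members)
    err = sum(abs(x - obs) for x in members) / m
    pair_sum = sum(abs(a - b) for a, b in combinations(members, 2))
    return err, pair_sum


def ensemble_crps(members, obs):
    """Standard estimator: spread term normalised by M^2 (biased for small M)."""
    m = len(members)
    err, pair_sum = _terms(members, obs)
    return err - pair_sum / (m * m)


def fair_crps(members, obs):
    """Fair estimator: spread term normalised by M(M-1); needs at least 2 members."""
    m = len(members)
    if m < 2:
        raise ValueError("fair CRPS needs at least two members")
    err, pair_sum = _terms(members, obs)
    return err - pair_sum / (m * (m - 1))


def almost_fair_crps(members, obs, alpha):
    """ASSUMED form: convex blend of fair and standard CRPS, with 0 <= alpha <= 1."""
    return alpha * fair_crps(members, obs) + (1.0 - alpha) * ensemble_crps(members, obs)


# One degenerate case of the fair CRPS: if all members but one equal the
# observation, the fair score is zero however far away the last member is.
members, obs = [0.0, 0.0, 0.0, 100.0], 0.0
print(fair_crps(members, obs))               # outlier is not penalised
print(almost_fair_crps(members, obs, 0.95))  # blend still penalises the outlier
```

With `alpha` close to 1 the blend stays near the fair score, so most of the finite-ensemble bias is removed, while the small remaining share of the standard score keeps the loss sensitive to the outlying member. The value 0.95 used here is illustrative only.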

Paper Structure

This paper contains 15 sections, 5 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Probabilistic training of AIFS-CRPS. A small ensemble of atmospheric states is propagated forward in time using separate model instances (that share the same weights). With ensemble sharding (see section \ref{sec:parallelism}), the ensemble forecasts are then gathered across all participating GPU devices using a differentiable all-gather operation. Finally, the (almost fair) CRPS loss is calculated from the AIFS-CRPS forecast ensemble and a deterministic analysis (e.g., ERA5) target.
  • Figure 2: 24-hr (left column) and 240-hr (right column) forecasts of meridional wind at 850 hPa, from perturbed member 1 of the IFS 9 km ensemble (approximately 0.1° spatial resolution; \ref{fig:24h_ifs}, \ref{fig:240h_ifs}), AIFS trained with an MSE loss (approximately 0.25° spatial resolution; \ref{fig:24h_aifs}, \ref{fig:240h_aifs}), perturbed member 1 of the AIFS-CRPS N320 ensemble (approximately 0.25° spatial resolution; \ref{fig:24h_aifs-kcrps-n320}, \ref{fig:240h_aifs-kcrps-n320}) and of the AIFS-CRPS O96 ensemble (approximately 1.0° spatial resolution; \ref{fig:24h_aifs-kcrps-o96}, \ref{fig:240h_aifs-kcrps-o96}). The forecasts are initialised on 1 March 2024, 00 UTC. For plotting, the fields have been interpolated to a regular 0.25° latitude-longitude grid.
  • Figure 3: Geopotential at 500 hPa of a 300-hour forecast from perturbed member 1 of the AIFS-CRPS ensemble when the model generates a tendency with respect to the full-resolution input (reference) field (\ref{fig:noise1}) compared to when the tendency is generated with respect to a truncated input (reference) field (\ref{fig:noise2}). For more explanation, see section \ref{sec:erroraccum}.
  • Figure 4: Spectra of geopotential at 500 hPa (\ref{fig:spc_z_1}, \ref{fig:spc_z_2}) and temperature at 850 hPa (\ref{fig:spc_t_2}, \ref{fig:comp_spc}) for different lead times. Step 0 h refers to the initial conditions / IFS analysis. Shown are the AIFS-CRPS ensemble without (\ref{fig:spc_z_1}) and with reference field truncation (\ref{fig:spc_z_2}, \ref{fig:spc_t_2}, \ref{fig:comp_spc}), and AIFS (\ref{fig:comp_spc}). Spectra are averaged over 12 initial dates and the first 8 ensemble members (\ref{fig:spc_z_1}, \ref{fig:spc_z_2} and \ref{fig:spc_t_2}). For the AIFS and AIFS-CRPS comparison (\ref{fig:comp_spc}), the spectra are averaged over 12 initial dates and AIFS-CRPS perturbed member 1 only. For more explanation, please see the text.
  • Figure 5: AIFS-CRPS N320 (blue, solid line) and IFS ensemble (green, dashed line) CRPS of 2 m temperature for different lead times in the northern extra-tropics verified against SYNOP observations (\ref{fig:crps_2tsfcob_n}), and of temperature at 850 hPa in the northern extra-tropics (\ref{fig:crps_t850pl_n}) and the Tropics (\ref{fig:crps_t850pl_tropics}) verified against analyses. Scores are averaged over the period 1 February to 30 September 2024, with forecasts initialised at 00 and 12 UTC.
  • ...and 8 more figures
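
The reference-field truncation mentioned in the Figure 3 and Figure 4 captions can be pictured with a toy one-dimensional rollout. In this sketch the next state is a reference field plus a predicted tendency, and the reference is either the full-resolution input or a coarsened copy of it. The block-average `coarsen` operator, the zero tendency standing in for the network, and all names are assumptions for illustration; the captions do not specify how the truncation is actually performed.

```python
def coarsen(field, factor):
    """Block-average a 1-D field, then repeat each mean back to full length."""
    if len(field) % factor != 0:
        raise ValueError("field length must be a multiple of factor")
    out = []
    for i in range(0, len(field), factor):
        mean = sum(field[i:i + factor]) / factor
        out.extend([mean] * factor)
    return out


def step(state, tendency_fn, factor=None):
    """One forecast step: reference field plus predicted tendency."""
    # factor=None keeps the full-resolution input as the reference field.
    reference = state if factor is None else coarsen(state, factor)
    tendency = tendency_fn(state)
    return [r + t for r, t in zip(reference, tendency)]


def rollout(state, tendency_fn, n_steps, factor=None):
    """Apply `step` repeatedly, feeding each output back in as input."""
    for _ in range(n_steps):
        state = step(state, tendency_fn, factor)
    return state


def zero_tendency(state):
    """Hypothetical stand-in for the network: predicts no change."""
    return [0.0] * len(state)


# A flat field contaminated with grid-scale noise of alternating sign.
noisy = [1.1, 0.9] * 4

full_ref = rollout(noisy, zero_tendency, n_steps=5)             # noise is carried along
trunc_ref = rollout(noisy, zero_tendency, n_steps=5, factor=2)  # noise is averaged out
print(full_ref)
print(trunc_ref)
```

With a full-resolution reference, any grid-scale noise in the input passes straight through the identity path at every step. With a truncated reference, the small scales in the output have to come from the predicted tendency, so input noise is not copied forward automatically.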