An ensemble of data-driven weather prediction models for operational sub-seasonal forecasting
Jonathan A. Weyn, Divya Kumar, Jeremy Berman, Najeeb Kazmi, Sylwester Klocek, Pete Luferenko, Kit Thambiratnam
TL;DR
This work presents an operational multi-model ensemble of data-driven weather prediction models coupled to an ocean model to deliver global forecasts at $1^{\circ}$ resolution for up to $4$ weeks of lead time. By training five diverse data-driven architectures and running them autoregressively with sea surface temperature (SST) coupling, the authors demonstrate near-state-of-the-art subseasonal forecasts and show that their probabilistic forecasts can be competitive with, and occasionally exceed, the ECMWF extended-range ensemble. They implement a hindcast-based bias correction and show that combining data-driven models with traditional NWP improves reliability and spread calibration. The approach highlights the practical potential of data-driven ensembles for operational subseasonal-to-seasonal (S2S) forecasting, while acknowledging limitations in extreme-event evaluation and precipitation skill, as well as the need for further methodological refinements.
Abstract
We present an operations-ready multi-model ensemble weather forecasting system that uses hybrid data-driven weather prediction models coupled with the European Centre for Medium-Range Weather Forecasts (ECMWF) ocean model to predict global weather at 1-degree resolution for 4 weeks of lead time. For predictions of 2-meter temperature, our ensemble on average outperforms the raw ECMWF extended-range ensemble by 4-17%, depending on the lead time. However, after applying statistical bias corrections, the ECMWF ensemble is about 3% better at 4 weeks. For other surface parameters, our ensemble is also within a few percentage points of ECMWF's ensemble. We demonstrate that it is possible to achieve near-state-of-the-art subseasonal-to-seasonal forecasts using a multi-model ensembling approach with data-driven weather prediction models.
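The abstract does not spell out the statistical bias correction or the score behind the percentage comparisons. As a rough illustration only, the Python sketch below shows one common form of hindcast-based correction (subtracting the mean hindcast error per lead time and grid point) and a relative-improvement calculation for a score where lower is better. The function names, array layouts, and the choice of a simple mean-bias correction are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np


def lead_time_bias(hindcasts, observations):
    """Mean forecast error per lead time and grid point (assumed method).

    Both inputs are assumed to have shape (n_start_dates, n_leads, n_lat, n_lon).
    Returns an array of shape (n_leads, n_lat, n_lon).
    """
    return np.mean(hindcasts - observations, axis=0)


def bias_correct(forecast, bias):
    """Subtract the hindcast-derived bias from every ensemble member.

    forecast is assumed to have shape (n_members, n_leads, n_lat, n_lon);
    bias has shape (n_leads, n_lat, n_lon) and is broadcast over members.
    """
    return forecast - bias[np.newaxis, ...]


def percent_improvement(score_model, score_reference):
    """Relative improvement, in percent, for a lower-is-better score."""
    return 100.0 * (score_reference - score_model) / score_reference
```

Under these assumptions, a statement such as "outperforms by 4-17%" would correspond to `percent_improvement` evaluated separately at each lead time, with the ensemble's score as `score_model` and the reference ensemble's score as `score_reference`.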
