GraphDOP: Towards skilful data-driven medium-range weather forecasts learnt and initialised directly from observations

Mihai Alexe, Eulalie Boucher, Peter Lean, Ewan Pinnington, Patrick Laloyaux, Anthony McNally, Simon Lang, Matthew Chantry, Chris Burrows, Marcin Chrust, Florian Pinault, Ethel Villeneuve, Niels Bormann, Sean Healy

TL;DR

GraphDOP presents a novel, end-to-end data-driven forecasting framework trained and initialised exclusively from Earth System observations, avoiding gridded reanalysis inputs. It combines a graph neural encoder–processor–decoder with a transformer-based latent-space predictor on a dense latent grid, learning the relationships between satellite radiances and geophysical variables to forecast up to five days ahead. Quantitative verification shows competitive short-range skill in observation space relative to the physics-based IFS, notably a 15% RMS improvement at 24h in the Tropics and small tropical biases at day 5, while grid-space results are generally robust up to day 5 and show clear improvements over climatology. The results, including sea-ice radiance and hurricane cases, demonstrate the potential of AI-DOP to synthesise heterogeneous observations into a coherent Earth System representation, offering on-demand forecasts with no background state and guiding future hybrid and probabilistic developments.

Abstract

We introduce GraphDOP, a new data-driven, end-to-end forecast system developed at the European Centre for Medium-Range Weather Forecasts (ECMWF) that is trained and initialised exclusively from Earth System observations, with no physics-based (re)analysis inputs or feedbacks. GraphDOP learns the correlations between observed quantities - such as brightness temperatures from polar orbiters and geostationary satellites - and geophysical quantities of interest (that are measured by conventional observations), to form a coherent latent representation of Earth System state dynamics and physical processes, and is capable of producing skilful predictions of relevant weather parameters up to five days into the future.

Paper Structure

This paper contains 13 sections, 1 equation, 15 figures, 1 table.

Figures (15)

  • Figure 1: Examples of data coverage from different observation types in a 12 hour window starting at 21 UTC on January 1st, 2021. NOAA-20 ATMS channel 5 (upper left), Meteosat 11 SEVIRI channel 5 (upper right), in-situ 2m temperature observations (lower left) and in-situ upper air observations (lower right).
  • Figure 2: A summary and timeline of the observation types currently included in the training dataset. This comprises both in-situ conventional data (from, e.g., surface stations and weather balloons) and Level-1 satellite observations from several instruments, including from geostationary and polar orbiters. Satellite observations are generally indicated by satellite names and instrument names; see the Appendix for a full list. Colours indicate the number of reports per day (each report may contain multiple observed variables or satellite channels). The period used for training the model described in this paper is marked by vertical dashed lines.
  • Figure 3: A schematic representation of the GraphDOP model. In this illustration, the model receives two 12-hour observation data windows that are embedded sequentially into a latent space representation using graph attention (Lang et al., 2024). Multi-layer perceptrons (MLPs) are used to project multi-channel observations from individual instruments to a common feature dimension. The MLP weights are shared across all input windows. One or more invocations of the backbone module (rollout in latent space) advance the state throughout the 12-hour target observation window. A graph transformer decoder calculates intermediate representations of the observations at the target locations; finally, instrument-specific decoder MLPs produce the forecasts. Optionally, the forecasts can be fed back as the input to the next forecast window (rollout in observation space, dashed gray lines).
  • Figure 4: IASI channel 921 (wavenumber 875.0 cm$^{-1}$) brightness temperatures (K): forecast (left), observed (middle), and difference (observed minus forecast; right). We show 12-hour samples, starting from forecast day 1 (Jan 7, 2023; top row) through to day 4 (Jan 11, 2023; bottom row). The global forecast RMSE for the 12-hour sample is printed in the top left corner. Blue shades are indicative of "cold" features such as clouds, while red shades correspond to "warm" features, e.g., warm surface areas unobscured by clouds.
  • Figure 5: Gridded forecasts at a lead time of 24 hours, valid on Jan 15, 2023, 12z (left) compared to the ERA5 reanalysis (middle). Right panels show the difference (reanalysis minus forecast), with the global forecast RMSE printed at the top right corner. In the bottom left and centre panels, the pixels where surface pressure is below 850 hPa are masked out (coloured white).
  • ...and 10 more figures
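
The Figure 3 caption describes instrument-specific MLPs that project multi-channel observations to a common feature dimension, with the same MLP weights reused for every input window. A minimal NumPy sketch of that one step is given below. It is an illustration only, not the GraphDOP implementation: the instrument names, channel counts, hidden width, latent dimension and the helper names `init_mlp`, `mlp`, `embed_window` and `make_window` are all hypothetical, and the graph attention encoder, backbone and decoder are not modelled.

```python
import numpy as np

LATENT_DIM = 16   # hypothetical common feature dimension
HIDDEN_DIM = 32   # hypothetical MLP hidden width

# Hypothetical instruments and channel counts, chosen only for illustration.
CHANNELS = {"sounder": 6, "imager": 3, "surface_station": 4}


def init_mlp(rng, in_dim, hidden_dim, out_dim):
    """Create weights for a two-layer MLP (illustrative random initialisation)."""
    return {
        "w1": rng.normal(0.0, in_dim ** -0.5, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, hidden_dim ** -0.5, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }


def mlp(params, x):
    """Apply the two-layer MLP with a ReLU non-linearity to rows of x."""
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)
    return hidden @ params["w2"] + params["b2"]


rng = np.random.default_rng(0)

# One encoder MLP per instrument; created once and shared by all windows.
encoders = {
    name: init_mlp(rng, n_channels, HIDDEN_DIM, LATENT_DIM)
    for name, n_channels in CHANNELS.items()
}


def embed_window(window):
    """Project each instrument's (n_obs, n_channels) array to LATENT_DIM
    features and stack the results into one (total_obs, LATENT_DIM) array."""
    return np.concatenate(
        [mlp(encoders[name], obs) for name, obs in window.items()], axis=0
    )


def make_window(rng, n_obs):
    """Generate synthetic observations: n_obs reports per instrument."""
    return {name: rng.normal(size=(n_obs, n)) for name, n in CHANNELS.items()}


# Two synthetic 12-hour input windows with different observation counts.
windows = [make_window(rng, 5), make_window(rng, 7)]
embedded = [embed_window(w) for w in windows]
print([e.shape for e in embedded])  # → [(15, 16), (21, 16)]
```

Because every instrument ends up with the same feature width, observations with different channel counts can be stacked into a single array per window; in the model described by the caption, this is the form in which they are passed on to the graph attention encoder.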