Super-Resolving Coarse-Resolution Weather Forecasts With Flow Matching

Aymeric Delefosse, Anastase Charantonis, Dominique Béréziat

Abstract

Machine learning-based weather forecasting models now surpass state-of-the-art numerical weather prediction systems, but training and operating these models at high spatial resolution remains computationally expensive. We present a modular framework that decouples forecasting from spatial resolution by applying learned generative super-resolution as a post-processing step to coarse-resolution forecast trajectories. We formulate super-resolution as a stochastic inverse problem, using a residual formulation to preserve large-scale structure while reconstructing unresolved variability. The model is trained with flow matching exclusively on reanalysis data and is applied to global medium-range forecasts. We evaluate (i) design consistency by re-coarsening super-resolved forecasts and comparing them to the original coarse trajectories, and (ii) high-resolution forecast quality using standard ensemble verification metrics and spectral diagnostics. Results show that super-resolution preserves large-scale structure and variance after re-coarsening, introduces physically consistent small-scale variability, and achieves competitive probabilistic forecast skill at 0.25° resolution relative to an operational ensemble baseline, while requiring only a modest additional training cost compared with end-to-end high-resolution forecasting.
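
Evaluation step (i) above can be sketched in a few lines. The following is a minimal illustration, not the paper's evaluation code: it assumes re-coarsening is plain block averaging, that the activity ratio is a ratio of spatial standard deviations, and that NRMSE is normalized by the standard deviation of the coarse field. It also ignores latitude weighting. The function names `recoarsen` and `consistency_metrics` are hypothetical.

```python
import numpy as np

def recoarsen(fine, factor):
    """Block-average a 2D field by `factor` along both axes (assumed operator)."""
    h, w = fine.shape
    if h % factor or w % factor:
        raise ValueError("field shape must be divisible by factor")
    return fine.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def consistency_metrics(recoarsened, coarse):
    """Compare a re-coarsened field with the original coarse field.

    Definitions here are assumptions made for illustration:
    correlation    = Pearson correlation over grid points (ideal: 1)
    activity_ratio = std(recoarsened) / std(coarse)       (ideal: 1)
    nrmse          = RMSE / std(coarse)                   (ideal: 0)
    """
    a = recoarsened - recoarsened.mean()
    b = coarse - coarse.mean()
    correlation = (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())
    activity_ratio = recoarsened.std() / coarse.std()
    nrmse = np.sqrt(np.mean((recoarsened - coarse) ** 2)) / coarse.std()
    return {
        "correlation": float(correlation),
        "activity_ratio": float(activity_ratio),
        "nrmse": float(nrmse),
    }

# Usage on synthetic data: a 6x finer field (1.5 deg -> 0.25 deg) whose
# small-scale detail averages to zero inside every coarse cell.
rng = np.random.default_rng(0)
coarse_field = rng.standard_normal((4, 8))
upsampled = np.repeat(np.repeat(coarse_field, 6, axis=0), 6, axis=1)
detail = rng.standard_normal(upsampled.shape)
detail -= np.repeat(np.repeat(recoarsen(detail, 6), 6, axis=0), 6, axis=1)
print(consistency_metrics(recoarsen(upsampled + detail, 6), coarse_field))
```

In this synthetic case the added detail has zero mean within each coarse cell, so the re-coarsened field matches the coarse input up to floating-point error. A learned model would only approximate this property.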

Paper Structure

This paper contains 56 sections, 19 equations, 17 figures, 1 table.

Figures (17)

  • Figure 1: Overview of the super-resolution procedure. (a) Super-resolution is applied independently to each coarse-resolution forecast state produced by the base forecasting model. (b) Sampling procedure: starting from Gaussian noise, a flow matching model generates a high-resolution residual conditioned on the interpolated coarse-resolution forecast, which is added to obtain the final high-resolution prediction.
  • Figure 2: Comparison of re-coarsened super-resolved forecasts with the original ArchesWeatherGen coarse-resolution trajectories at 1-day lead time. Shown are (a) spatial correlation, (b) activity ratio, and (c) normalized RMSE, for surface variables and selected pressure levels. Color intensity encodes the deviation from the ideal reference: values approaching white indicate near-perfect agreement (unity for correlation and activity ratio, zero for NRMSE), while darker shades correspond to larger departures from the original coarse-resolution trajectories.
  • Figure 3: Global ensemble forecast skill at 0.25°. Relative improvement over IFS ENS for fair ($f$) ensemble skill-score ($ss$) metrics (Ensemble Mean RMSE, CRPS, Energy Score, Brier Score, spread-skill ratio), averaged over WeatherBench headline variables.
  • Figure 4: Per-variable fair CRPS skill scores relative to IFS ENS at 0.25°. Results are shown for GenCast, and ArchesWeatherGen (AWG) after bicubic interpolation and learned super-resolution, as a function of lead time.
  • Figure 5: Comparison of power spectra across models and lead times. Power spectra of Z500, T850, Q700, U850, and V850 at 1-day (top) and 10-day (bottom) lead times. For each model and wavelength, the energy at that wavelength is averaged across samples. The dashed vertical line marks the transition between low- and high-resolution scales (1.5° to 0.25°).
  • ...and 12 more figures
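
The sampling procedure described in the Figure 1(b) caption can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the ODE is integrated with fixed-step forward Euler, nearest-neighbour repetition stands in for the interpolation of the coarse forecast, and `toy_velocity` is a hypothetical closed-form velocity field used in place of a trained flow matching network. All function names are illustrative.

```python
import numpy as np

def upsample_nearest(coarse, factor):
    """Stand-in interpolation: repeat each coarse cell factor x factor times."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

def sample_residual(velocity, cond, n_steps=20, rng=None):
    """Integrate dx/dt = velocity(x, t, cond) from t=0 to t=1 with forward Euler.

    x starts as Gaussian noise; the final state is the sampled residual.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(cond.shape)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1
        x = x + dt * velocity(x, t, cond)
    return x

def super_resolve(coarse, velocity, factor=6, n_steps=20, rng=None):
    """Interpolated coarse state plus a generated high-resolution residual."""
    cond = upsample_nearest(coarse, factor)
    residual = sample_residual(velocity, cond, n_steps, rng)
    return cond + residual

def toy_velocity(x, t, cond):
    """Hypothetical velocity field (NOT a trained model).

    For the linear path x_t = (1 - t) * noise + t * target, this is the exact
    velocity when the target residual is deterministic given the condition.
    """
    target = 0.1 * np.sin(cond)
    return (target - x) / (1.0 - t)

# Usage: one coarse state, upsampled by 6 (1.5 deg -> 0.25 deg).
coarse_state = np.arange(12.0).reshape(3, 4)
fine_state = super_resolve(coarse_state, toy_velocity, factor=6,
                           rng=np.random.default_rng(0))
print(fine_state.shape)
```

Following Figure 1(a), `super_resolve` would be called independently on every state of a coarse forecast trajectory. With a trained stochastic model, different noise draws would yield different residuals for the same coarse input, whereas the toy field here always converges to the same residual.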