Generating ensembles of spatially-coherent in-situ forecasts using flow matching

David Landry, Claire Monteleoni, Anastase Charantonis

TL;DR

FMAP addresses the need for spatially coherent, multivariate postprocessing of in-situ forecasts by learning a flow-matching generative model conditioned on gridded forecasts. A spatial attention transformer backbone models cross-station dependencies, and the model can generate an arbitrary number of forecast realizations from a limited number of numerical simulations, with a single training covering lead times up to five days. In experiments on the EUPPBench dataset for surface temperature and wind gust at 122 stations in western Europe, FMAP improves multivariate metrics, achieves competitive marginal scores, shows favorable spectral properties, and is illustrated with a visual case study. The work discusses limitations such as inference cost and outlines future extensions to spatio-temporal generation and to heavy-tailed distributions for precipitation and extremes.

Abstract

We propose a machine-learning-based methodology for in-situ weather forecast postprocessing that is both spatially coherent and multivariate. Compared to previous work, our Flow MAtching Postprocessing (FMAP) better represents the correlation structures of the observations distribution, while also improving marginal performance at the stations. FMAP generates forecasts that are not bound to what is already modeled by the underlying gridded prediction and can infer new correlation structures from data. The resulting model can generate an arbitrary number of forecasts from a limited number of numerical simulations, allowing for low-cost forecasting systems. A single training is sufficient to perform postprocessing at multiple lead times, in contrast with other methods which use multiple trained networks at generation time. This work details our methodology, including a spatial attention transformer backbone trained within a flow matching generative modeling framework. FMAP shows promising performance in experiments on the EUPPBench dataset, forecasting surface temperature and wind gust values at station locations in western Europe up to five-day lead times.

Paper Structure

This paper contains 40 sections, 22 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: A) The flow matching generation process uses a transformer backbone to iteratively turn an easily sampled random state into a postprocessed in-situ forecast. Rather than predicting the desired state directly, the generative process predicts the residual from the raw ensemble mean. B) Transformer architecture producing the next flow matching state. The predictions are made using conditioning features from the underlying forecast $\bm{C}_t$, the previous flow matching state $\bm{z}_s$, and the flow matching time $s$. C) Input sequence construction. The input values are concatenated, then processed with a linear mapping and a station embedding before being dispatched to the transformer blocks. The grayed-out symbols describe the sizes of the data dimensions.
  • Figure 2: Flow matching starts from a known distribution $p_0(\bm{z})$ to build an approximation $p_1(\bm{z})$ of the target distribution $q(\bm{z})$. The process unfolds over flow matching time $s$.
  • Figure 3: Postprocessing model skill scores for spatially coherent forecasts as a function of lead time. Higher is better. The baseline for skill scores is the Distribution Regression Network with Ensemble Copula Coupling (DRN-ECC). Shaded areas show the 5% to 95% confidence interval from a pairwise bootstrap procedure.
  • Figure 4: Postprocessing model spread-error ratios.
  • Figure 5: Sample forecasts for our model (FMAP) and the Energy Score Generative Model (ESGM). The values are displayed as anomalies relative to a rolling-window climatology. The framed maps represent the mean of the generated ensembles.
  • ...and 1 more figures
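
The generation process described in the captions of Figures 1 and 2 can be illustrated with a minimal sketch: a state drawn from an easily sampled distribution $p_0$ is integrated over flow matching time $s \in [0, 1]$, and the result is treated as a residual that is added back to the raw ensemble mean. Everything specific below is an assumption made for illustration and is not taken from the paper: `toy_velocity` is a hypothetical closed-form stand-in for the trained transformer (it transports a standard normal to a per-station Gaussian and ignores cross-station dependence), the explicit Euler integrator is one possible solver, and the station values, `mu`, and `sigma` are made up.

```python
import numpy as np


def toy_velocity(z, s, mu, sigma):
    """Hypothetical stand-in for the trained network v(z_s, s, C_t).

    Closed-form velocity of the straight-line flow that carries N(0, 1)
    to N(mu, sigma^2) independently at each station (requires sigma > 0).
    """
    scale = 1.0 + s * (sigma - 1.0)  # std of the state at flow time s
    return mu + (sigma - 1.0) * (z - s * mu) / scale


def sample_members(raw_mean, mu, sigma, n_members, n_steps=20, seed=0):
    """Integrate dz/ds = v(z, s) from s = 0 to s = 1 with explicit Euler steps."""
    rng = np.random.default_rng(seed)
    # One independent draw from p_0 per requested ensemble member.
    z0 = rng.standard_normal((n_members, raw_mean.shape[0]))
    z = z0.copy()
    h = 1.0 / n_steps
    for k in range(n_steps):
        z = z + h * toy_velocity(z, k * h, mu, sigma)
    # The flow state is a residual: add the raw ensemble mean back.
    return raw_mean + z, z0


# Made-up raw ensemble means at three stations and toy residual statistics.
raw_mean = np.array([12.0, 9.5, 14.2])
mu = np.array([-0.5, 0.3, 0.0])
sigma = np.array([1.2, 0.8, 1.0])

members, z0 = sample_members(raw_mean, mu, sigma, n_members=50)
print(members.shape)  # (50, 3)
```

Because each member comes from its own draw of the initial state, the number of generated members is a free parameter of the sampler rather than a property of the underlying numerical ensemble. In a trained model the velocity would come from the network conditioned on the gridded forecast, and that is where cross-station structure would enter.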