Generating ensembles of spatially-coherent in-situ forecasts using flow matching
David Landry, Claire Monteleoni, Anastase Charantonis
TL;DR
FMAP addresses the need for spatially coherent, multivariate postprocessing of in-situ forecasts by learning a flow-matching generative model conditioned on gridded forecasts. It uses a spatial attention transformer backbone to model cross-station dependencies and can generate an arbitrary number of forecast realizations from a limited number of numerical simulations. A single trained model covers all lead times up to five days. In experiments on the EUPPBench dataset for surface temperature and wind gust at 122 stations in western Europe, FMAP improves multivariate metrics while achieving competitive marginal scores. Its forecasts also show favorable spectral properties, and a visual case study illustrates the results. The work discusses limitations such as inference cost and outlines future extensions to spatio-temporal generation and to heavy-tailed distributions for precipitation and extremes.
Abstract
We propose a machine-learning-based methodology for in-situ weather forecast postprocessing that is both spatially coherent and multivariate. Compared to previous work, our Flow MAtching Postprocessing (FMAP) better represents the correlation structures of the observation distribution, while also improving marginal performance at the stations. FMAP generates forecasts that are not bound to what is already modeled by the underlying gridded prediction and can infer new correlation structures from data. The resulting model can generate an arbitrary number of forecasts from a limited number of numerical simulations, allowing for low-cost forecasting systems. A single training is sufficient to perform postprocessing at multiple lead times, in contrast with other methods that use multiple trained networks at generation time. This work details our methodology, including a spatial attention transformer backbone trained within a flow matching generative modeling framework. FMAP shows promising performance in experiments on the EUPPBench dataset, forecasting surface temperature and wind gust values at station locations in western Europe at lead times of up to five days.
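For readers unfamiliar with flow matching, the following is a minimal generic sketch of the idea, not FMAP's implementation. It assumes a straight-line path between Gaussian noise and the observations and a fixed-step Euler solver; neither choice is stated in this section. The function names and the `toy_velocity` stand-in are hypothetical, and in the actual method the velocity field would be a trained network (here, the spatial attention transformer) conditioned on the gridded forecast.

```python
import numpy as np


def flow_matching_pair(x1, rng):
    """Build network inputs and regression targets for one training batch.

    x1: observations, shape (batch, n_stations, n_variables).
    Assumes a straight-line path from noise x0 to data x1.
    """
    x0 = rng.standard_normal(x1.shape)         # one noise sample per example
    t = rng.uniform(size=(x1.shape[0], 1, 1))  # one time in [0, 1) per example
    xt = (1.0 - t) * x0 + t * x1               # point on the straight path
    target = x1 - x0                           # velocity along that path
    return xt, t, target


def sample_ensemble(velocity, cond, n_members, n_steps, rng):
    """Euler-integrate dx/dt = velocity(x, t, cond) from t=0 to t=1.

    Each ensemble member starts from its own noise draw, so the number
    of members is independent of the size of the conditioning forecast.
    """
    x = rng.standard_normal((n_members,) + cond.shape)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt, cond)
    return x


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a trained network.

    It pulls every member straight onto `cond`, so all members coincide
    at t=1; a trained model would instead produce ensemble spread.
    """
    return (cond - x) / (1.0 - t)


rng = np.random.default_rng(0)
cond = rng.standard_normal((122, 2))  # 122 stations, 2 variables
members = sample_ensemble(toy_velocity, cond, n_members=5, n_steps=8, rng=rng)
print(members.shape)  # (5, 122, 2)
```

During training, a network would be fitted to predict `target` from `xt`, `t`, and the conditioning forecast, typically with a mean-squared-error loss. Inference cost grows with `n_steps` because each step requires one network evaluation per member.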
