Marchuk: Efficient Global Weather Forecasting from Mid-Range to Sub-Seasonal Scales via Flow Matching

Arsen Kuzhamuratov, Mikhail Zhirnov, Andrey Kuznetsov, Ivan Oseledets, Konstantin Sobolev

Abstract

Accurate subseasonal weather forecasting remains a major challenge due to the inherently chaotic nature of the atmosphere, which limits the predictive skill of conventional models beyond the mid-range horizon (approximately 15 days). In this work, we present Marchuk, a generative latent flow-matching model for global weather forecasting spanning mid-range to subseasonal timescales, with prediction horizons of up to 30 days. Marchuk conditions on current-day weather maps and autoregressively predicts subsequent days' weather maps within the learned latent space. We replace rotary positional encodings (RoPE) with trainable positional embeddings and extend the temporal context window, which together enhance the model's ability to represent and propagate long-range temporal dependencies during latent forecasting. Marchuk offers two key advantages: high computational efficiency and strong predictive performance. Despite its compact architecture of only 276 million parameters, the model achieves performance comparable to LaDCast, a substantially larger model with 1.6 billion parameters, while operating at significantly higher inference speeds. We open-source our inference code and model at: https://v-gen-ai.github.io/Marchuk/
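
The autoregressive latent forecasting described in the abstract can be illustrated with a small sketch. This is not the released Marchuk code: the function names, the Euler integrator, the time convention (noise at t = 0, data at t = 1), and the toy velocity field standing in for the trained model are all assumptions made for illustration. It only shows the general shape of the loop: draw noise for a chunk of future frames, integrate a velocity field conditioned on the most recent frames, append the result, and repeat.

```python
import numpy as np


def toy_velocity(x_t, t, context):
    """Hypothetical stand-in for a trained velocity network.

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1, the velocity that
    reaches a known target x_1 is (x_1 - x_t) / (1 - t). The "target" here is an
    arbitrary decayed copy of the last context frame, chosen only so the sketch
    is deterministic and easy to check.
    """
    target = 0.9 * context[-1]
    return (target - x_t) / (1.0 - t)


def sample_chunk(velocity_fn, context, n_future, steps, rng):
    """Integrate the flow from noise (t = 0) to a sample (t = 1) with Euler steps."""
    x = rng.standard_normal((n_future,) + context.shape[1:])
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt  # t stays strictly below 1, so 1 - t is never zero
        x = x + dt * velocity_fn(x, t, context)
    return x


def rollout(velocity_fn, init_context, n_future, horizon, steps=8, seed=0):
    """Autoregressive rollout: each new chunk is conditioned on the latest frames."""
    rng = np.random.default_rng(seed)
    k = init_context.shape[0]  # number of conditioning frames (assumed fixed)
    frames = init_context
    produced = 0
    while produced < horizon:
        chunk = sample_chunk(velocity_fn, frames[-k:], n_future, steps, rng)
        frames = np.concatenate([frames, chunk], axis=0)
        produced += n_future
    # Drop the initial context and trim any overshoot past the horizon.
    return frames[k:k + horizon]


init_context = np.ones((2, 4))  # 2 context frames, 4 latent values each (toy sizes)
forecast = rollout(toy_velocity, init_context, n_future=3, horizon=7)
print(forecast.shape)
```

In a real system the velocity function would be a trained network operating on encoded latents, and the number of integration steps trades sampling cost against accuracy; neither is specified by this sketch.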

Paper Structure

This paper contains 16 sections, 9 figures, 11 tables.

Figures (9)

  • Figure 1: Marchuk operates in the latent space learned by the DC-AE model introduced in LaDCast. Within this latent space, we train a flow-matching Diffusion Transformer (DiT) to model the conditional distribution of future weather fields given the current state, enabling efficient and accurate probabilistic forecasting.
  • Figure 2: Marchuk is conditioned on weather maps from the previous $K$ days and is provided with noised weather maps for the subsequent $N$ days as input. The model generates refined forecasts for the next $N$ days, effectively denoising and predicting future weather states.
  • Figure 3: Marchuk is a flow-matching DiT model with additional architectural optimizations.
  • Figure 4: RMSE comparison. We evaluate LaDCast and Marchuk on the WeatherBench-2 benchmark over a 30-day prediction horizon. The atmospheric variables include the u-component of wind, temperature, geopotential, and specific humidity at 500, 750, and 850 hPa levels. Surface variables include mean sea level pressure, 10-meter wind u- and v-components, and temperature at 2 meters. Marchuk outperforms the small LaDCast model and achieves comparable performance to the large LaDCast model. Beyond approximately 30 days, both models’ forecasts converge toward climatology, indicating that extending the prediction horizon further provides limited practical value.
  • Figure 5: CRPS Ensemble Metrics Comparison. The figure shows the evolution of CRPS over a 30-day forecast horizon. The first row presents results for atmospheric variables at the 500 hPa level: the u-component of wind, temperature, geopotential, and specific humidity. The second row shows CRPS for surface variables: mean sea level pressure, the 10-meter u-component of wind, 2-meter temperature, and total precipitation accumulated over 6 hours.
  • ...and 4 more figures
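
For readers unfamiliar with the ensemble CRPS reported in Figure 5, the following is a minimal sketch of the standard ensemble estimator, CRPS = E|X - y| - 0.5 E|X - X'|, where X and X' are independent ensemble members and y is the observation. The function name and array layout are assumptions, and the paper's evaluation may use a different variant (for example a "fair" estimator or latitude weighting), which this sketch does not attempt to reproduce.

```python
import numpy as np


def crps_ensemble(ensemble, observation):
    """Ensemble CRPS estimate, computed independently at every grid point.

    ensemble:    array of shape (M, ...) holding M ensemble members.
    observation: array of shape (...) holding the verifying value(s).
    Lower is better; with a single member it reduces to the absolute error.
    """
    ens = np.asarray(ensemble, dtype=float)
    obs = np.asarray(observation, dtype=float)
    # Mean absolute error of the members against the observation.
    skill = np.abs(ens - obs).mean(axis=0)
    # Mean absolute difference over all member pairs (including i == j).
    spread = np.abs(ens[:, None] - ens[None, :]).mean(axis=(0, 1))
    return skill - 0.5 * spread


# Two members bracketing the observation.
score = crps_ensemble(np.array([1.0, 3.0]), 2.0)
print(score)
```

The pairwise term rewards ensembles whose spread matches their error: a collapsed ensemble gets no credit from it, so its CRPS equals its mean absolute error.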