Incorporating Multivariate Consistency in ML-Based Weather Forecasting with Latent-space Constraints
Hang Fan, Yi Xiao, Yongquan Qu, Fenghua Ling, Ben Fei, Lei Bai, Pierre Gentine
TL;DR
The authors address blurring and physical inconsistency in ML-based weather forecasts by reframing rollout training as a weak-constraint 4DVar problem and replacing the model-space loss with latent-space constraints derived from an autoencoder. By approximating the reanalysis error covariance in latent space as near-diagonal, they obtain a tractable loss that preserves multivariate dependencies and scale interactions, improving long-range skill and fine-scale structure while maintaining physical realism. Experiments on a coarsened ERA5 dataset show that latent-space-constrained models (DFM-LC) outperform their model-space-constrained counterparts (DFM-MC) in multiscale fidelity and dynamical balance, though at added computational cost and with some long-range RMSE trade-offs. The framework also extends to heterogeneous data sources, offering a unified objective for integrating reanalysis and observations in deterministic forecast training, with potential extensions to probabilistic forecasts and Earth-system coupling.
Abstract
Data-driven machine learning (ML) models have recently shown promise in surpassing traditional physics-based approaches for weather forecasting, leading to a so-called second revolution in weather forecasting. However, most ML-based forecast models treat reanalysis as the truth and are trained under variable-specific loss weighting, ignoring the physical coupling among variables and their spatial structure. Under rollout training, forecasts over long time horizons become blurry and physically unrealistic. To address this, we reinterpret model training as a weak-constraint four-dimensional variational data assimilation (WC-4DVar) problem, treating reanalysis data as imperfect observations. This allows the loss function to incorporate the reanalysis error covariance and capture multivariate dependencies. In practice, we compute the loss in a latent space learned by an autoencoder (AE), where the reanalysis error covariance becomes approximately diagonal, thus avoiding the need to explicitly model it in the high-dimensional model space. We show that rollout training with latent-space constraints improves long-term forecast skill and better preserves fine-scale structures and physical realism compared to training with a model-space loss. Finally, we extend this framework to accommodate heterogeneous data sources, enabling the forecast model to be trained jointly on reanalysis and multi-source observations within a unified theoretical formulation.
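To make the latent-space idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy linear, orthogonal "encoder" Q standing in for the trained autoencoder, and a hypothetical diagonal latent error variance `latent_var`; all names here are illustrative. Under these assumptions, a variance-weighted squared error in latent space equals the full-covariance (Mahalanobis) loss in model space, so the multivariate coupling is accounted for without ever inverting the model-space covariance.

```python
import numpy as np

def latent_loss(encode, forecast, reanalysis, latent_var):
    """Squared error between encoded fields, weighted by a diagonal latent variance."""
    dz = encode(forecast) - encode(reanalysis)
    return float(np.sum(dz**2 / latent_var))

rng = np.random.default_rng(0)
n = 6  # toy state dimension (assumption; real model states are far larger)

# Hypothetical linear encoder: an orthogonal matrix Q, so z = Q @ x.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))

# Assumed diagonal error variance in latent space.
latent_var = rng.uniform(0.5, 2.0, size=n)

# Model-space error covariance implied by that diagonal latent covariance.
# It is dense: errors in different state components are correlated.
R = Q.T @ np.diag(latent_var) @ Q

def encode(x):
    return Q @ x

forecast = rng.standard_normal(n)
reanalysis = rng.standard_normal(n)

# Loss computed in latent space with diagonal weighting only.
loss_latent = latent_loss(encode, forecast, reanalysis, latent_var)

# Equivalent loss computed in model space with the full covariance R.
d = forecast - reanalysis
loss_model = float(d @ np.linalg.solve(R, d))

print(np.isclose(loss_latent, loss_model))
```

With a nonlinear autoencoder, as used in the paper, this equivalence holds only approximately, which matches the abstract's statement that the latent covariance is approximately diagonal.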
