A unified neural background-error covariance model for midlatitude and tropical atmospheric data assimilation
Boštjan Melinc, Uroš Perkan, Žiga Zaplotnik
TL;DR
The paper tackles the challenge of representing background-error covariances in variational data assimilation across both tropical and midlatitude regimes by learning a flow-dependent latent representation with a convolutional autoencoder trained on ERA5 reanalysis. Data assimilation is performed in latent space using a latent-space background-error covariance matrix $B_z$; both a climatological $B_z^{clim}$ and an ensemble-derived $B_z^{EDA}$ are tested for their ability to capture balance structures and flow dependence. The study demonstrates that latent-space 3D-Var can yield geostrophic and thermal-wind–balanced increments in the midlatitudes and a latent-heat–driven tropical response for total column water vapor (TCWV) observations, producing physically plausible forecasts that include, for example, an eastward-propagating Kelvin wave in the tropics. While ensemble-based covariances show promise for capturing flow-dependent variability, limitations in the autoencoder’s capacity and in latent-space Gaussianity remain, pointing to future work on larger latent spaces and normalizing-flow approaches to enable a full latent-space 4D-Var with improved cross-domain assimilation capabilities.
Abstract
Estimating background-error covariances remains a core challenge in variational data assimilation (DA). Operational systems typically approximate these covariances by transformations that separate geostrophically balanced components from unbalanced inertio-gravity modes, an approach well suited to the midlatitudes but less applicable in the tropics, where different physical balances prevail. This study estimates background-error covariances in a reduced-dimension latent space learned by a neural-network autoencoder (AE). The AE was trained using 40 years of ERA5 reanalysis data, enabling it to capture flow-dependent atmospheric balances from a diverse set of weather states. We demonstrate that performing DA in the latent space yields analysis increments that preserve multivariate horizontal and vertical physical balances in both the tropical and midlatitude atmosphere. Assimilating a single 500 hPa geopotential height observation in the midlatitudes produces increments consistent with geostrophic and thermal wind balance, while assimilating a total column water vapor observation with a positive departure in the nearly saturated tropical atmosphere generates an increment resembling the tropical response to (latent) heat-induced perturbations. The resulting increments are localized, flow-dependent, and shaped by orography and land-sea contrasts. Forecasts initialized from these analyses exhibit realistic weather evolution, including the excitation of an eastward-propagating Kelvin wave in the tropics. Finally, we explore the transition from using synthetic ensembles and a climatology-based background-error covariance matrix to an operational ensemble of data assimilations. Despite significant compression-induced variance loss in some variables, latent-space assimilation produces balanced, flow-dependent increments, highlighting its potential for ensemble-based latent-space 4D-Var.
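To make the idea of a single-observation latent-space analysis concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear decoder `D` standing in for the trained autoencoder's decoder, a synthetic latent ensemble from which a sample covariance plays the role of $B_z$, and a point-sampling observation operator `H`. All names, sizes, and values are illustrative assumptions. Under these linear assumptions the 3D-Var cost $J(z) = \tfrac{1}{2}(z - z_b)^T B_z^{-1} (z - z_b) + \tfrac{1}{2}(y - Gz)^T R^{-1} (y - Gz)$, with $G = HD$, has a closed-form minimizer.

```python
import numpy as np


def latent_3dvar_analysis(z_b, B_z, G, y, R):
    """Closed-form minimizer of the latent 3D-Var cost for a linear G = H @ D."""
    d = y - G @ z_b                       # innovation: observation minus background
    S = G @ B_z @ G.T + R                 # innovation covariance in observation space
    return z_b + B_z @ G.T @ np.linalg.solve(S, d)


rng = np.random.default_rng(0)
n_grid, n_latent, n_members = 50, 8, 200  # illustrative sizes (assumptions)

# Hypothetical linear decoder; the real decoder is a nonlinear neural network.
D = rng.standard_normal((n_grid, n_latent))

# Synthetic latent ensemble; its sample covariance stands in for B_z.
Z_ens = rng.standard_normal((n_members, n_latent)) * rng.uniform(0.5, 2.0, n_latent)
B_z = np.cov(Z_ens, rowvar=False)

# Single observation at grid point 20 with a departure of +1 from the background.
H = np.zeros((1, n_grid))
H[0, 20] = 1.0
G = H @ D
R = np.array([[0.1]])                     # observation-error variance (assumption)
z_b = rng.standard_normal(n_latent)
y = G @ z_b + np.array([1.0])

z_a = latent_3dvar_analysis(z_b, B_z, G, y, R)
increment = D @ (z_a - z_b)               # analysis increment on the grid
```

Although only one grid point is observed, `increment` is nonzero across the whole grid, because the correction is made in latent space and then decoded. In this toy setting the spatial and multivariate structure of the increment comes entirely from `D` and `B_z`. With a nonlinear decoder the cost would instead have to be minimized iteratively.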
