A unified neural background-error covariance model for midlatitude and tropical atmospheric data assimilation

Boštjan Melinc, Uroš Perkan, Žiga Zaplotnik

TL;DR

The paper tackles the challenge of representing background-error covariances in variational data assimilation across both tropical and midlatitude regimes by learning a flow-dependent latent representation via a convolutional autoencoder trained on ERA5 reanalysis. Data assimilation is performed in latent space using a latent-space background-error covariance matrix $B_z$, with both climatological ($B_z^{\mathrm{clim}}$) and ensemble-derived ($B_z^{\mathrm{EDA}}$) covariances tested to capture balance structures and flow dependence. The study demonstrates that latent-space 3D-Var can yield geostrophic and thermal-wind–balanced increments in the midlatitudes and a latent-heat–driven tropical response for total column water vapor (TCWV) observations, and that forecasts initialized from these analyses are physically plausible, including an eastward-propagating Kelvin wave in the tropics. While ensemble-based covariances show promise for capturing flow-dependent variability, limitations of the autoencoder’s capacity and latent Gaussianity remain, pointing to future work on larger latent spaces and normalizing-flow approaches to enable a full latent-space 4D-Var with improved cross-domain assimilation capabilities.

Abstract

Estimating background-error covariances remains a core challenge in variational data assimilation (DA). Operational systems typically approximate these covariances by transformations that separate geostrophically balanced components from unbalanced inertio-gravity modes - an approach well-suited for the midlatitudes but less applicable in the tropics, where different physical balances prevail. This study estimates background-error covariances in a reduced-dimension latent space learned by a neural-network autoencoder (AE). The AE was trained using 40 years of ERA5 reanalysis data, enabling it to capture flow-dependent atmospheric balances from a diverse set of weather states. We demonstrate that performing DA in the latent space yields analysis increments that preserve multivariate horizontal and vertical physical balances in both the tropical and midlatitude atmosphere. Assimilating a single 500 hPa geopotential height observation in the midlatitudes produces increments consistent with geostrophic and thermal wind balance, while assimilating a total column water vapor observation with a positive departure in the nearly saturated tropical atmosphere generates an increment resembling the tropical response to (latent) heat-induced perturbations. The resulting increments are localized, flow-dependent, and shaped by orography and land-sea contrasts. Forecasts initialized from these analyses exhibit realistic weather evolution, including the excitation of an eastward-propagating Kelvin wave in the tropics. Finally, we explore the transition from using synthetic ensembles and a climatology-based background-error covariance matrix to an operational ensemble of data assimilations. Despite significant compression-induced variance loss in some variables, latent-space assimilation produces balanced, flow-dependent increments - highlighting its potential for ensemble-based latent-space 4D-Var.
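
To make the single-observation experiments concrete, the following is a minimal sketch of a latent-space 3D-Var analysis. It is not the authors' implementation: the decoder is replaced by a hypothetical random linear map `D` (the paper uses a neural-network autoencoder), the latent covariance `B_z` is a toy diagonal matrix, and all sizes and names are illustrative assumptions. Only the 30 m departure and the 10 m observation-error standard deviation are taken from the Z500 experiment in the figure captions. With a linear decoder, the minimizer of the quadratic cost $J(z) = \tfrac{1}{2}(z - z_b)^\top B_z^{-1}(z - z_b) + \tfrac{1}{2}(y - HDz)^\top R^{-1}(y - HDz)$ has a closed form; with a nonlinear decoder the minimization would generally be iterative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions, far smaller than any real model state).
n_state, n_latent = 50, 8

# Hypothetical linear stand-in for the autoencoder's decoder: x = D @ z.
D = rng.normal(size=(n_state, n_latent))

# Background latent vector and a toy diagonal latent covariance B_z.
z_b = rng.normal(size=n_latent)
B_z = np.diag(rng.uniform(0.5, 2.0, size=n_latent))

# Observation operator H picks a single grid point of the decoded state.
obs_index = 10
H = np.zeros((1, n_state))
H[0, obs_index] = 1.0

# Single observation: 30 m departure, 10 m observation-error std.
departure = 30.0
sigma_o = 10.0
R = np.array([[sigma_o**2]])
y = H @ (D @ z_b) + departure


def latent_3dvar_analysis(z_b, B_z, D, H, R, y):
    """Closed-form minimizer of the latent 3D-Var cost for a linear decoder."""
    G = H @ D                      # observation operator acting on latent space
    d = y - G @ z_b                # innovation (observation minus background)
    S = G @ B_z @ G.T + R          # innovation covariance
    K = B_z @ G.T @ np.linalg.inv(S)  # gain in latent space
    return z_b + K @ d


z_a = latent_3dvar_analysis(z_b, B_z, D, H, R, y)

# Analysis increment decoded back to state space; it is nonzero away from
# the observed point because B_z and D spread the information.
increment = D @ (z_a - z_b)
print("increment at observation location:", increment[obs_index])
```

In this toy setting the increment at the observed point is a fraction of the departure set by the ratio of background to total variance in observation space, while the structure of the increment elsewhere is determined entirely by `D` and `B_z`. This is the mechanism through which a learned decoder and latent covariance could impose multivariate balance.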

Paper Structure

This paper contains 15 sections, 7 equations, 19 figures, 1 table.

Figures (19)

  • Figure 1: Analysis increments following assimilation of a Z500 observation above Ljubljana with a departure of 30 m and an observation-error standard deviation of 10 m. (a) Z500 increment (colors) and 500 hPa horizontal wind increment (arrows); (b) T500 increment; (c) T2m increment (colors) and MSLP increment (the two purple contours denote $+0.15$ hPa and $+0.30$ hPa increments). The observation location is denoted by a golden star.
  • Figure 2: Vertical cross sections of analysis increments following assimilation of a Z500 observation. Each cross section is taken at the grid latitude or longitude nearest to the observation. (a) 2D longitude-pressure cross section of geopotential height increment at latitude $46.5\,^\circ\mathrm{N}$. (b) 2D latitude-pressure cross section of zonal wind increment at longitude $14.0\,^\circ\mathrm{E}$. (c) 2D longitude-pressure cross section of meridional wind increment at latitude $46.5\,^\circ\mathrm{N}$. (d-f) As (a-c), but showing the normalized relative impact of the observation. Gaussian filtering with a standard deviation of $1^\circ$ was applied in both horizontal directions to smooth the contours. The observation location is denoted by a golden star.
  • Figure 3: A comparison of (a) the difference between the U500 and U700 analysis increments with (b) the same difference derived from the thermal wind approximation in Eq. \ref{eq:thermal wind}. (c) The difference (a) $-$ (b).
  • Figure 4: Z500 analysis increments following assimilation of a Z500 observation above Ljubljana with a 30 m departure and a 10 m observation-error standard deviation on two dates with different backgrounds: (a) January 1st, 2020, at 00 UTC, and (b) January 8th, 2020, at 00 UTC. The arrows denote the background 500 hPa wind. The observation location is marked by a golden star.
  • Figure 5: Difference between the forecasts initialized from the analysis and from the background for a selected ensemble member, based on the experiment in Fig. \ref{fig:Ljubljana}. Difference in the initial condition (the analysis increment) for (a) Z500 (colors) and 500 hPa wind (arrows), (b) total column water vapor (colors) and MSLP (purple contours), and (c) T2m. (d-f) As (a-c), but for the 24-hour forecast lead time. (g-i) As (d-f), but for the 48-hour forecast lead time. The solid/dashed contours in (b,e,h) indicate a positive/negative difference with a 0.5 hPa step; the zero contour is omitted.
  • ...and 14 more figures
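
The thermal wind check in Figure 3 can be illustrated with a short calculation. The paper's own thermal wind equation is not reproduced in this listing, so the sketch below assumes the standard textbook form for the zonal component between two pressure levels, $u_g(p_1) - u_g(p_0) = -\frac{R}{f}\,\frac{\partial \overline{T}}{\partial y}\,\ln\frac{p_0}{p_1}$ with $p_0 > p_1$, where $\overline{T}$ is the layer-mean temperature. The function names and the temperature-gradient value are illustrative assumptions, not values from the paper; the latitude matches the cross sections in Figure 2.

```python
import math

R_DRY = 287.0      # gas constant for dry air, J kg^-1 K^-1
OMEGA = 7.292e-5   # Earth's rotation rate, s^-1


def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def thermal_wind_u(dT_dy, lat_deg, p_lower_hpa, p_upper_hpa):
    """Zonal geostrophic wind difference u(p_upper) - u(p_lower) in m/s.

    dT_dy is the layer-mean meridional temperature gradient in K/m
    (negative when temperature decreases northward).
    """
    f = coriolis(lat_deg)
    return -(R_DRY / f) * dT_dy * math.log(p_lower_hpa / p_upper_hpa)


# Illustrative gradient (assumption): 1 K cooling per 1000 km northward.
du_500_700 = thermal_wind_u(
    dT_dy=-1.0e-6, lat_deg=46.5, p_lower_hpa=700.0, p_upper_hpa=500.0
)
print(f"U500 - U700 from thermal wind: {du_500_700:.2f} m/s")
```

Applied to temperature increments instead of full fields, the same relation gives the expected U500 minus U700 increment difference, which is the quantity Figure 3 compares against the difference obtained directly from the wind increments.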