Ensemble Kalman filter in latent space using a variational autoencoder pair

Ivo Pasmans, Yumeng Chen, Tobias Sebastian Finn, Marc Bocquet, Alberto Carrassi

TL;DR

This work addresses the challenge of non-Gaussian errors and constrained dynamics in data assimilation by performing ensemble transform Kalman filter (ETKF) updates in the latent space of variational autoencoders (VAEs). A single-VAE approach confines ensemble members to a physically meaningful manifold, while a double-VAE variant targets observational innovations to mitigate non-Gaussian bias. Online retraining (transfer learning) of the first VAE is shown to be essential when the underlying manifold changes over time, and the second VAE provides robustness to non-Gaussian observation errors, particularly under strong skewness. Overall, latent-space ETKF-VAEs improve distributional fidelity and physical consistency compared with the standard ETKF, with practical implications for complex geophysical systems such as sea-ice models.

Abstract

Popular (ensemble) Kalman filter data assimilation (DA) approaches assume that the errors in both the a priori estimate of the state and the observations are Gaussian. For constrained variables, e.g. sea ice concentration or stress, this assumption does not hold. The variational autoencoder (VAE) is a machine learning (ML) technique that maps an arbitrary distribution to and from a latent space in which the distribution is supposedly closer to Gaussian. We propose a novel hybrid DA-ML approach in which VAEs are incorporated in the DA procedure. Specifically, we introduce a variant of the popular ensemble transform Kalman filter (ETKF) in which the analysis is applied in the latent space of a single VAE or a pair of VAEs. In twin experiments with a simple circular model, in which the circle represents an underlying submanifold to be respected, we find that the use of a VAE ensures that a posteriori ensemble members lie close to the manifold containing the truth. Furthermore, online updating of the VAE is necessary and achievable when this manifold varies in time, i.e. when it is non-stationary. We demonstrate that introducing an additional second latent space for the observational innovations improves robustness against the detrimental effects of non-Gaussianity and bias in the observational errors, but slightly degrades performance if the observational errors are strictly Gaussian.
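To make the idea of applying the ETKF analysis in a latent space concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The functions `encode` and `decode` are hypothetical, deterministic stand-ins for a trained VAE: they map a point on the unit circle to its angle and back, which mimics a latent space whose decoder always lands on the manifold. The observation operator is assumed to be the identity in state space, and all numerical values (ensemble size, spreads, error variances) are illustrative choices.

```python
import numpy as np

def etkf_analysis(Z, Y, y_obs, R):
    """Generic ETKF analysis step with a symmetric square-root transform.

    Z: (n, M) ensemble, Y: (p, M) predicted observations per member,
    y_obs: (p,) observation, R: (p, p) observation-error covariance.
    """
    M = Z.shape[1]
    z_mean = Z.mean(axis=1, keepdims=True)
    y_mean = Y.mean(axis=1, keepdims=True)
    Zp, Yp = Z - z_mean, Y - y_mean              # ensemble perturbations
    Rinv_Yp = np.linalg.solve(R, Yp)             # R^{-1} Yp
    A = (M - 1) * np.eye(M) + Yp.T @ Rinv_Yp     # inverse analysis covariance in ensemble space
    evals, evecs = np.linalg.eigh(A)
    Pa = (evecs / evals) @ evecs.T               # A^{-1}
    W = (evecs * np.sqrt((M - 1) / evals)) @ evecs.T   # symmetric sqrt of (M-1) A^{-1}
    w_mean = Pa @ (Rinv_Yp.T @ (y_obs - y_mean[:, 0]))
    return z_mean + Zp @ (w_mean[:, None] + W)

def encode(X):
    """Hypothetical stand-in encoder (NOT a VAE): (2, M) states -> (1, M) angles."""
    return np.arctan2(X[1], X[0])[None, :]

def decode(Z):
    """Hypothetical stand-in decoder: (1, M) angles -> (2, M) points on the unit circle."""
    return np.vstack([np.cos(Z[0]), np.sin(Z[0])])

rng = np.random.default_rng(0)
M, truth_angle, obs_std = 20, 0.5, 0.05          # illustrative values only

# Forecast ensemble on the circle, biased away from the truth.
X_f = decode((0.9 + 0.3 * rng.standard_normal(M))[None, :])
R = obs_std**2 * np.eye(2)
y_obs = decode(np.array([[truth_angle]]))[:, 0] + obs_std * rng.standard_normal(2)

# Latent-space analysis: encode, update in latent space, decode.
Z_f = encode(X_f)
Z_a = etkf_analysis(Z_f, decode(Z_f), y_obs, R)  # predicted obs = decoded members
X_a = decode(Z_a)

# State-space ETKF on the same data, for comparison.
X_etkf = etkf_analysis(X_f, X_f, y_obs, R)

print("latent-space analysis radii:", np.linalg.norm(X_a, axis=0).round(3))
print("state-space analysis radii: ", np.linalg.norm(X_etkf, axis=0).round(3))
```

Because the stand-in decoder can only output points on the circle, the latent-space analysis members stay on the manifold by construction, whereas the state-space update combines members linearly and carries no such guarantee. With a real VAE the decoder is stochastic and learned, so members would lie close to the manifold rather than exactly on it.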

Paper Structure

This paper contains 20 sections, 16 equations, 11 figures, 3 tables, 3 algorithms.

Figures (11)

  • Figure 1: Schematic overview of the (top) single ETKF-VAE and (top+bottom) double ETKF-VAE approaches. (a) Alternative innovations are generated by drawing ensemble members and adding realisations of the observational error, (b) the first and second VAE are trained on the forecast ensemble and alternative innovations respectively, (c) the first encoder is used to sample one ensemble member in latent space for each ensemble member in state space, (d) the innovation-encoder is used to sample $K$ perturbed innovations and $M$ unperturbed innovations in latent space, (e) the ETKF is performed using the ensembles of states, perturbed innovations and innovations, (f) for each member in the analysis ensemble the first decoder samples a member in the state space.
  • Figure 2: Architecture of the first variational autoencoder. The encoder consists of (a) common input nodes, (b) $6$ hidden layers predicting the mean $\mu_{\phi}$ of the conditional distribution and (c) $6$ hidden layers predicting its log-variance $\ln \Sigma_{\phi}$. Prior to (e) outputting, the latent mean and log-variance are (d) rescaled using an affine transformation (see text). The decoder (f) accepts a latent state as input, (g) applies the inverse of the latent affine transformation to the value and feeds the value to (h,i) a pair of $6$ hidden layers outputting the (j) mean $\mu_{\theta}$ and (k) log-variance $\ln \Sigma_{\theta}$ in state space. For clarity only $8$ of the $32$ nodes in each hidden layer are shown.
  • Figure 3: (a) Climatology in the physical space generated by drawing samples from a standard normal distribution and, for each of these, drawing a sample using the first decoder. (b) Climatology in the latent space obtained by taking states from the climatology and, for each state, drawing a sample in the latent space using the first encoder (red), together with the standard normal (black). The network architecture used $6$ dense hidden layers and $32$ nodes per hidden layer.
  • Figure 4: Taylor diagrams for (a) x-coordinate, (b) y-coordinate, (c) radius and (d) angle. The standard deviation of the time series of the forecast mean is shown along the radial axis, the correlation with the truth along the azimuthal axis and the RMSE as dashed lines. Bars indicate the 90% confidence interval.
  • Figure 5: As Figure 4 but now showing standard deviations and correlations with the mean of the analysis ensemble.
  • ...and 6 more figures
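
The layout described in the Figure 2 caption (separate $6$-layer, $32$-node branches for the mean and the log-variance in both encoder and decoder) can be sketched as a forward pass in NumPy. This is an untrained illustration under stated assumptions, not the authors' network: the weights are random, the `tanh` activation, the latent dimension of $2$, the state dimension of $2$ and the diagonal covariance are all assumptions, and the affine rescaling of the latent variables is omitted. The names `init_mlp`, `mlp`, `sample_gaussian`, `encode` and `decode` are hypothetical.

```python
import numpy as np

N_STATE, N_LATENT = 2, 2      # assumed dimensions (not stated in the captions)
N_HIDDEN, N_LAYERS = 32, 6    # from the Figure 2 caption

def init_mlp(rng, n_in, n_out):
    """Random (untrained) weights for a dense net with N_LAYERS hidden layers."""
    sizes = [n_in] + [N_HIDDEN] * N_LAYERS + [n_out]
    return [(rng.normal(0.0, 1.0 / np.sqrt(a), size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass; tanh on hidden layers is an assumed activation."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b          # linear output layer

def sample_gaussian(mean, logvar, rng):
    """Reparameterised draw from N(mean, diag(exp(logvar)))."""
    return mean + np.exp(0.5 * logvar) * rng.standard_normal(mean.shape)

rng = np.random.default_rng(0)
# Separate branches for mean and log-variance, as in the caption.
enc_mean, enc_logvar = init_mlp(rng, N_STATE, N_LATENT), init_mlp(rng, N_STATE, N_LATENT)
dec_mean, dec_logvar = init_mlp(rng, N_LATENT, N_STATE), init_mlp(rng, N_LATENT, N_STATE)

def encode(x, rng):
    """Sample one latent member per state-space member."""
    return sample_gaussian(mlp(enc_mean, x), mlp(enc_logvar, x), rng)

def decode(z, rng):
    """Sample one state-space member per latent member."""
    return sample_gaussian(mlp(dec_mean, z), mlp(dec_logvar, z), rng)

x = rng.standard_normal((5, N_STATE))   # five illustrative ensemble members
z = encode(x, rng)
x_rec = decode(z, rng)
print(z.shape, x_rec.shape)
```

Predicting the log-variance rather than the variance keeps the implied variance positive without constraining the network output, which is the usual reason for this parameterisation in VAEs.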