Measure transport with kernel mean embeddings

L. Wang, N. Nüsken

TL;DR

The paper proposes kernel mean embedding (KME) dynamics, which transport probability measures from a prior to a tempered posterior in continuous time. Classical mean-field dynamics require the particle distribution to equal the target at every time; KME-dynamics relaxes this to equality of the two kernel mean embeddings in a reproducing kernel Hilbert space. From this condition the authors derive a mean-field ODE for particle transport, together with a kernelised continuity equation and an integral equation for the velocity field. With the quadratic kernel the dynamics reduce to the Kalman-Bucy filter, which links KME-dynamics to classical Kalman theory, while general kernels allow non-Gaussian targets. The framework also includes a score-estimation mechanism based on kernel identities, a weighted (importance sampling) variant that compensates for numerical errors, and a variational interpretation that connects to kernelised diffusion maps and diffusion-type PDEs. Numerical experiments on toy problems and the Lorenz 63 and 96 models illustrate the method's robustness and the practical advantages of a Kalman-adjusted variant for data assimilation, suggesting a scalable, nonparametric alternative to ensemble Kalman filtering in nonlinear settings.

Abstract

Kalman filters constitute a scalable and robust methodology for approximate Bayesian inference, matching first and second order moments of the target posterior. To improve the accuracy in nonlinear and non-Gaussian settings, we extend this principle to include more or different characteristics, based on kernel mean embeddings (KMEs) of probability measures into reproducing kernel Hilbert spaces. Focusing on the continuous-time setting, we develop a family of interacting particle systems (termed $\textit{KME-dynamics}$) that bridge between prior and posterior, and that include the Kalman-Bucy filter as a special case. KME-dynamics does not require the score of the target, but rather estimates the score implicitly and intrinsically, and we develop links to score-based generative modeling and importance reweighting. A variant of KME-dynamics has recently been derived from an optimal transport and Fisher-Rao gradient flow perspective by Maurais and Marzouk, and we expose further connections to (kernelised) diffusion maps, leading to a variational formulation of regression type. Finally, we conduct numerical experiments on toy examples and the Lorenz 63 and 96 models, comparing our results against the ensemble Kalman filter and the mapping particle filter (Pulido and van Leeuwen, 2019, J. Comput. Phys.). Our experiments show particular promise for a hybrid modification (called Kalman-adjusted KME-dynamics).
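
To make the idea of matching kernel mean embeddings concrete, the following is a minimal sketch and not the authors' code. It estimates the squared RKHS distance between the empirical embeddings of two particle ensembles (the squared maximum mean discrepancy). The function names, the specific quadratic kernel $k(x,y) = (1 + x^\top y)^2$, the bandwidth value and the bimodal test ensemble are all assumptions chosen for illustration. The sketch shows that a quadratic kernel only sees first and second moments, which is the sense in which Kalman-type methods match moments, whereas an RBF kernel also detects differences in shape.

```python
import numpy as np


def rbf_kernel(X, Y, bandwidth=0.5):
    """Gram matrix of k(x, y) = exp(-|x - y|^2 / (2 * bandwidth^2))."""
    sq_dists = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def quadratic_kernel(X, Y):
    """Gram matrix of k(x, y) = (1 + x.y)^2 (illustrative choice, an assumption)."""
    return (1.0 + X @ Y.T) ** 2


def mmd_squared(X, Y, kernel):
    """Biased estimate of ||Phi_k(P) - Phi_k(Q)||^2 from samples X ~ P, Y ~ Q."""
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()


def match_first_two_moments(Y, X):
    """Affinely transform Y so its empirical mean and covariance equal those of X."""
    n = Y.shape[0]
    Yc = Y - Y.mean(0)
    Xc = X - X.mean(0)
    L_y = np.linalg.cholesky(Yc.T @ Yc / n)  # cov(Y) = L_y L_y^T
    L_x = np.linalg.cholesky(Xc.T @ Xc / n)  # cov(X) = L_x L_x^T
    Z = np.linalg.solve(L_y, Yc.T).T         # whitened: zero mean, identity covariance
    return Z @ L_x.T + X.mean(0)


rng = np.random.default_rng(0)
n, d = 1000, 2

# Ensemble 1: standard Gaussian particles.
X = rng.standard_normal((n, d))

# Ensemble 2: bimodal particles, rescaled to share X's empirical mean and covariance.
raw = rng.choice([-1.0, 1.0], size=(n, d)) + 0.3 * rng.standard_normal((n, d))
Y = match_first_two_moments(raw, X)

mmd_quad = mmd_squared(X, Y, quadratic_kernel)  # numerically zero: moments agree
mmd_rbf = mmd_squared(X, Y, rbf_kernel)         # clearly positive: shapes differ

print(f"quadratic-kernel MMD^2: {mmd_quad:.2e}")
print(f"RBF-kernel MMD^2:       {mmd_rbf:.2e}")
```

Under these assumptions the quadratic-kernel discrepancy vanishes up to floating-point error even though the two ensembles are visibly different, while the RBF-kernel discrepancy does not. A richer kernel therefore gives a stricter notion of "same embedding" than matching mean and covariance alone.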

Paper Structure

This paper contains 15 sections, 8 theorems, 121 equations, 8 figures, 2 algorithms.

Key Result

Lemma 4

The kernel mean embeddings $\Phi_k(\pi_t)$ and $\Phi_k(\rho_t)$ satisfy the $\mathcal{H}_k$-valued ODEs ... where $\phi_k$ denotes the feature map, and the covariance is given by ...

Figures (8)

  • Figure 1: Three Gaussian toy examples.
  • Figure 2: Posterior approximation and target p.d.f. for the one-dimensional skew-normal experiment.
  • Figure 3: We perform different sampling methods on the 'Gaussian prior to Gaussian posterior' toy experiment and evaluate their performances by comparing the estimated normalising constant with the true value. The performance is evaluated against four parameters: dimensionality (top left), sample size (top right), bandwidth of the RBF kernel (bottom left), and the number of iterations for KME-dynamics (bottom right).
  • Figure 4: EnKF vs KMED for different regularisations, for large model error (left), small model error (middle), and tiny model error (right).
  • Figure 5: EnKF vs Kalman-adjusted KMED for different regularisations, for large model error (left), small model error (middle), and tiny model error (right).
  • ...and 3 more figures

Theorems & Definitions (22)

  • Definition 1: Characteristic kernels
  • Lemma 4: Evolution equations for KMEs
  • Remark 5: Kernel Bayes' and Kalman rule
  • Remark 6: Kernelised continuity equation
  • Lemma 7: Properties of $G_{\rho,C}$
  • Remark 8: Other types of regularisation
  • Proposition 9: From KME-dynamics to Kalman-Bucy
  • Remark 10: $\varepsilon \rightarrow 0$
  • Remark 11
  • Remark 12: Affine invariance
  • ...and 12 more