Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps

Mohammad Al-Jarrah, Bamdad Hosseini, Amirhossein Taghvaei

TL;DR

The paper tackles real-time nonlinear filtering with non-Gaussian posteriors, where standard optimal transport filters (OTFs) are expensive online because a map must be trained at every time step. It introduces the Amortized Optimal Transport Filter (A-OTF), an offline-online framework that pre-trains a library of OT maps and, at run time, approximates the current Bayesian update as a localized, weighted combination of these maps via kernel-like interpolation and clustering. The offline stage uses K-medoids to cluster the pre-trained maps, and the online stage forms $T_t$ as a weighted sum of cluster-specific maps; theoretical intuition from nonparametric estimation supports consistency under regularity assumptions. Numerical experiments on Lorenz 63 and high-dimensional linear-quadratic models show substantial online computational savings with accuracy competitive with EnKF, SIR, and OTF, and examine robustness with respect to the choice of distance metric and the dimensionality. The approach makes real-time non-Gaussian filtering practical by using transport-based updates without online map training, and may apply broadly to complex dynamical systems.

Abstract

In this paper, we present the amortized optimal transport filter (A-OTF) designed to mitigate the computational burden associated with the real-time training of optimal transport filters (OTFs). OTFs can perform accurate non-Gaussian Bayesian updates in the filtering procedure, but they require training at every time step, which makes them expensive. The proposed A-OTF framework exploits the similarity between OTF maps during an initial/offline training stage in order to reduce the cost of inference during online calculations. More precisely, we use clustering algorithms to select relevant subsets of pre-trained maps whose weighted average is used to compute the A-OTF model akin to a mixture of experts. A series of numerical experiments validate that A-OTF achieves substantial computational savings during online inference while preserving the inherent flexibility and accuracy of OTF.
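The weighted-average idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the softmax-style kernel, the toy affine maps, the distance values, and the names `kernel_weights` and `amortized_map` are all assumptions made for illustration. The sketch only shows how a library of pre-trained maps might be blended into a single update map, with closer library entries receiving larger weights.

```python
import numpy as np

def kernel_weights(distances, lam=1.0):
    """Turn distances to library entries into normalized weights.

    Assumption: a softmax-style kernel exp(-lam * d); the paper's exact
    weighting scheme may differ.
    """
    d = np.asarray(distances, dtype=float)
    logits = -lam * d
    logits -= logits.max()          # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def amortized_map(maps, distances, lam=1.0):
    """Blend K pre-trained maps T_k(x, y) into one map T(x, y)."""
    w = kernel_weights(distances, lam)

    def T(x, y):
        # Weighted average of the individual map outputs.
        return sum(w_k * T_k(x, y) for w_k, T_k in zip(w, maps))

    return T, w

# Hypothetical library of K = 3 "pre-trained" maps; each one moves a
# particle x a fraction a of the way toward the observation y.
maps = [lambda x, y, a=a: x + a * (y - x) for a in (0.2, 0.5, 0.8)]

# Hypothetical distances between the current prior and each library entry.
distances = [0.1, 1.0, 3.0]

T, w = amortized_map(maps, distances, lam=1.0)
particles = np.array([[0.0], [1.0], [2.0]])
updated = T(particles, np.array([1.0]))
```

In this sketch a larger `lam` concentrates the weight on the nearest library entry, while a small `lam` approaches a uniform average over all maps.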

Paper Structure

This paper contains 15 sections, 16 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Numerical results for the Lorenz 63 example in Section \ref{sec:L63}. The left column shows the particle trajectory distributions of the first unobserved components of the particles along with the true distribution, where A-OTF uses $d_{W_2}$, $\rho_{W_2}$, $K=20$, and $\lambda=1$. The right column presents the empirical $W_2$ distances between each method and the true distribution as a function of the number of particles $N$ for fixed $\mu_0 = 4$ and $\sigma_0 = 5$, averaged over five independent simulations of 500 time steps each.
  • Figure 2: Numerical results for the quadratic observation example in Section \ref{sec:lin-example}. The left panel shows the particle distribution for different methods compared to the true distribution as a function of time. The right panel shows the empirical $W_2$ distance between the true distribution and the output distribution of each method, averaged over five independent simulations, as a function of the number of particles $N$.
  • Figure 3: Numerical results for the quadratic observation example in Section \ref{sec:lin-example}. The left column shows the particle trajectory distributions of the first unobserved components of the particles along with the true distribution. The right three columns show the empirical $W_2$ distance between the true distribution and the output distribution of each method, averaged over five independent simulations and $50$ time steps, as a function of the number $K$ of selected maps according to $d_{W_2}$ (left panel), $d_T$ (middle panel), and $d_{\mathrm{MMD}}$ (right panel), respectively.
  • Figure 4: Numerical results for the quadratic observation example in Section \ref{sec:lin-example}. The first two panels present the empirical $W_2$ distances between each method and the true distribution as a function of the particle mean $\mu_0$ and variance $\sigma_0$, with the pre-trained amortized maps held fixed. The third and fourth panels present the same error metric as a function of the number of particles $N$ and the computational time, respectively. All results are averaged over five independent simulations.
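The captions above report empirical $W_2$ distances between particle sets. As a rough illustration of that metric, the sketch below computes it for scalar particles with equal weights and equal sample sizes, where sorting both samples yields the optimal coupling. The one-dimensional restriction and the function name `empirical_w2_1d` are assumptions made for this example; the paper's experiments involve multi-dimensional states, which would need an assignment or OT solver instead.

```python
import numpy as np

def empirical_w2_1d(x, y):
    """Empirical 2-Wasserstein distance between two equal-size 1-D samples.

    Assumes uniform weights; in one dimension the optimal coupling pairs
    the i-th smallest point of x with the i-th smallest point of y.
    """
    x = np.sort(np.asarray(x, dtype=float).ravel())
    y = np.sort(np.asarray(y, dtype=float).ravel())
    if x.size != y.size:
        raise ValueError("samples must have the same number of particles")
    return float(np.sqrt(np.mean((x - y) ** 2)))

rng = np.random.default_rng(0)
reference = rng.normal(size=500)      # stand-in for "true" particles
shifted = reference + 3.0             # same sample translated by 3
shuffled = rng.permutation(reference) # same sample in a different order

dist_shifted = empirical_w2_1d(reference, shifted)
dist_shuffled = empirical_w2_1d(reference, shuffled)
```

A pure translation moves every particle by the same amount, so the distance equals the size of the shift, while reordering the same particles leaves the distance at zero.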

Theorems & Definitions (2)

  • Remark 1
  • Remark 2