Nonlinear Bayesian Filtering with Natural Gradient Gaussian Approximation

Wenhan Cao, Tianyi Zhang, Zeju Sun, Chang Liu, Stephen S.-T. Yau, Shengbo Eben Li

TL;DR

This work reframes nonlinear Bayesian filtering with Gaussian approximations as two optimization problems: a moment-matching prediction step and a nonlinear, coupled update step. The authors derive the NANO filter, which pairs exact moment-based prediction with a natural-gradient update on the Gaussian manifold, leveraging the Fisher information to account for parameter-space curvature. They prove local convergence to the optimal Gaussian approximation and exponential mean-square stability under near-linear measurements and low noise, and extend the framework to Gibbs posteriors for robustness. Empirical results across linear, nonlinear, and real-world tracking tasks show that NANO and its robust variants outperform standard Gaussian filters (e.g., EKF, UKF, IEKF, PLF) while maintaining feasible computation, supporting practical utility in complex state estimation scenarios.

Abstract

Practical Bayes filters often assume the state distribution at each time step to be Gaussian for computational tractability, resulting in the so-called Gaussian filters. When facing nonlinear systems, Gaussian filters such as the extended Kalman filter (EKF) or the unscented Kalman filter (UKF) typically rely on certain linearization techniques, which can introduce large estimation errors. To address this issue, this paper reconstructs the prediction and update steps of Gaussian filtering as solutions to two distinct optimization problems, whose optimality conditions are found to have analytical forms from Stein's lemma. It is observed that the stationary point for the prediction step requires calculating the first two moments of the prior distribution, which is equivalent to that step in existing moment-matching filters. In the update step, instead of linearizing the model to approximate the stationary points, we propose an iterative approach that directly minimizes the update step's objective to avoid linearization errors. To perform steepest descent on the Gaussian manifold, we derive its natural gradient, which leverages the Fisher information matrix to adjust the gradient direction, accounting for the curvature of the parameter space. Combining this update step with moment matching in the prediction step, we introduce a new iterative filter for nonlinear systems called the Natural Gradient Gaussian Approximation filter, or NANO filter for short. We prove that the NANO filter locally converges to the optimal Gaussian approximation at each time step. Furthermore, the estimation error is proven to be exponentially bounded for nearly linear measurement equations and low noise levels by constructing a supermartingale-like property across consecutive time steps.
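The update step described in the abstract can be illustrated with a minimal sketch. It assumes the standard natural-gradient rule for a Gaussian in precision form and is not the authors' algorithm as published. The function names `cubature_points` and `natural_gradient_update`, the callables `grad_nll` and `hess_nll` (gradient and Hessian of the negative log-likelihood of the measurement), and the use of a third-degree spherical cubature rule to approximate the Gaussian expectations are all assumptions made for illustration. For a linear Gaussian measurement the iteration should reproduce the Kalman update, which the usage part checks numerically.

```python
import numpy as np

def cubature_points(mu, Sigma):
    """Return the 2n equally weighted points of a third-degree cubature rule."""
    n = mu.size
    L = np.linalg.cholesky(Sigma)
    offsets = np.sqrt(n) * np.hstack([L, -L])   # shape (n, 2n)
    return (mu[:, None] + offsets).T            # shape (2n, n)

def natural_gradient_update(m, P, grad_nll, hess_nll, step=1.0, iters=20):
    """Minimise E_q[-log p(y|x)] + KL(q || N(m, P)) over q = N(mu, Sigma).

    Hypothetical helper: m, P are the predicted (prior) mean and covariance.
    """
    P_inv = np.linalg.inv(P)
    mu, S = m.copy(), P_inv.copy()              # S is the precision Sigma^{-1}
    for _ in range(iters):
        pts = cubature_points(mu, np.linalg.inv(S))
        # Expected gradient and Hessian of the loss; the prior term is exact.
        g = np.mean([grad_nll(x) for x in pts], axis=0) + P_inv @ (mu - m)
        H = np.mean([hess_nll(x) for x in pts], axis=0) + P_inv
        # Natural-gradient step: precision first, then the mean.
        S = (1.0 - step) * S + step * H
        mu = mu - step * np.linalg.solve(S, g)
    return mu, np.linalg.inv(S)

# Usage: linear measurement y = Hm x + noise, noise ~ N(0, R).
m = np.array([0.0, 1.0])
P = np.array([[2.0, 0.3], [0.3, 1.0]])
Hm = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
y = np.array([1.2])
R_inv = np.linalg.inv(R)

grad = lambda x: -Hm.T @ R_inv @ (y - Hm @ x)
hess = lambda x: Hm.T @ R_inv @ Hm
mu, Sigma = natural_gradient_update(m, P, grad, hess)

# Reference: closed-form Kalman update.
K = P @ Hm.T @ np.linalg.inv(Hm @ P @ Hm.T + R)
mu_kf = m + K @ (y - Hm @ m)
Sigma_kf = (np.eye(2) - K @ Hm) @ P
print(np.allclose(mu, mu_kf), np.allclose(Sigma, Sigma_kf))
```

With a nonlinear measurement model the expected Hessian may be indefinite, in which case the precision `S` can lose positive definiteness and the Cholesky factorisation fails; a smaller `step` or a Gauss-Newton style Hessian is a common workaround, though that choice is again an assumption of this sketch.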

Paper Structure

This paper contains 17 sections, 7 theorems, 102 equations, 8 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

The prior distribution can be regarded as the maximizer of a variational problem. Similarly, the posterior distribution can be regarded as the minimizer of a functional. Note that in both the prediction and the update optimization problems, $q: \mathbb{R}^n \to \mathbb{R}$ represents the candidate density function. Besides, we use the notation $\mathbb{E}_{\substack{p(x)\\p(y)}}\left\{f(x,
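For orientation, variational problems of this kind are commonly written as below. This is a sketch in assumed notation ($x_t$ for the state, $y_{1:t}$ for the measurements up to time $t$, $D_{\mathrm{KL}}$ for the Kullback-Leibler divergence), and the paper's own equations may be stated differently. The first line follows from Gibbs' inequality, and the second is the standard variational characterization of Bayes' rule.

```latex
\begin{align}
p(x_t \mid y_{1:t-1})
  &= \arg\max_{q}\;
     \mathbb{E}_{\substack{p(x_{t-1} \mid y_{1:t-1})\\ p(x_t \mid x_{t-1})}}
     \left\{ \log q(x_t) \right\}, \\
p(x_t \mid y_{1:t})
  &= \arg\min_{q}\;
     \mathbb{E}_{q(x_t)}\left\{ -\log p(y_t \mid x_t) \right\}
     + D_{\mathrm{KL}}\!\left( q(x_t) \,\middle\|\, p(x_t \mid y_{1:t-1}) \right).
\end{align}
```

Restricting $q$ to the Gaussian family in these two problems gives the kind of prediction and update optimizations that the abstract describes.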

Figures (8)

  • Figure 1: Box plot of RMSE for KF, UKF, PLF and NANO filter, for the standard Wiener velocity model. Note that the small black square "$\blacksquare$" represents the average RMSE over all the Monte Carlo experiments.
  • Figure 2: Box plot of RMSE for KF, UKF, PLF, NANO filter, and its robust variants with various parameter values, for the Wiener velocity model with measurement outliers.
  • Figure 3: Box plot of RMSE for EKF, UKF, IEKF, PLF and NANO filter, for the standard air-traffic control model.
  • Figure 4: Box plot of RMSE for EKF, UKF, IEKF, PLF, NANO filter, and its robust variants with various parameter values, for the air-traffic model with measurement outliers.
  • Figure 5: The UGV and the experiment field. The three red traffic cones serve as landmarks for positioning.
  • ...and 3 more figures

Theorems & Definitions (19)

  • Proposition 1: Variational Problems for Bayesian filtering
  • proof
  • Lemma 1: Stationary Points for Maximum Gaussian Likelihood
  • proof
  • Example 1: Prediction step of Kalman filter
  • Lemma 2: Gradient of expectation under Gaussian distribution
  • proof
  • Example 2: Update step of Kalman filter
  • Remark 1
  • Proposition 2
  • ...and 9 more