A Computationally Efficient Maximum A Posteriori Sequence Estimation via Stein Variational Inference

Min-Won Seo, Solmaz S. Kia

TL;DR

This work tackles MAP sequence estimation under multimodal posteriors in robotics by coupling SVGD with a Viterbi-style DP in a sequential variational framework. A two-stage approach first builds a compact, particle-based discretization of time-evolving posteriors via SVGD, then performs forward DP and backtracking to recover a globally optimal MAP trajectory, with theoretical guarantees that ELBO maximization aligns with MAP trajectory recovery as the mollifier vanishes. The method achieves substantial accuracy and robustness improvements across nonlinear, data-association, range-localization, and high-dimensional manipulation tasks while using far fewer particles than traditional particle-filter-based MAP methods, and it parallelizes easily on modern hardware. The results suggest Stein-MAP-Seq as both a standalone MAP-Seq estimator and a practical initialization front-end for batch MAP optimization, enabling reliable trajectory recovery in challenging multimodal settings with real-time potential.
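
The second stage described above (forward DP followed by backtracking over time-indexed particle sets) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `viterbi_over_particles`, the callables `log_prior`, `log_trans`, and `log_lik`, and the toy random-walk model with a squared (sign-ambiguous) observation are all assumptions made for this example, and the particle sets are drawn at random here rather than produced by SVGD.

```python
import numpy as np

def viterbi_over_particles(particles, observations, log_prior, log_trans, log_lik):
    """Return the highest-scoring path through time-indexed particle sets.

    particles[t] has shape (N_t, d); the three callables are hypothetical,
    user-supplied log-densities (additive constants may be dropped).
    """
    # delta[i]: best log-score of any path that ends at particle i of the current step
    delta = log_prior(particles[0]) + log_lik(particles[0], observations[0])
    backptrs = []
    for t in range(1, len(particles)):
        # scores[i, j]: best path ending at i (step t-1), then moving to j (step t)
        scores = delta[:, None] + log_trans(particles[t - 1], particles[t])
        backptrs.append(np.argmax(scores, axis=0))
        delta = np.max(scores, axis=0) + log_lik(particles[t], observations[t])
    # Backtrack from the best final particle.
    idx = int(np.argmax(delta))
    path = [idx]
    for bp in reversed(backptrs):
        idx = int(bp[idx])
        path.append(idx)
    path.reverse()
    traj = np.stack([particles[t][i] for t, i in enumerate(path)])
    return traj, path

# Toy model (assumed): Gaussian prior, random-walk dynamics, observation y = x^2 + noise.
log_prior = lambda P: -0.5 * np.sum(P**2, axis=1)
log_trans = lambda A, B: -0.5 * np.sum((B[None, :, :] - A[:, None, :])**2, axis=2) / 0.5**2
log_lik = lambda P, y: -0.5 * (y - P[:, 0]**2)**2 / 0.3**2

rng = np.random.default_rng(0)
particles = [rng.normal(size=(4, 1)) for _ in range(5)]  # stand-in for SVGD output
observations = [0.5, 0.2, 0.1, 0.3, 0.0]
traj, path = viterbi_over_particles(particles, observations, log_prior, log_trans, log_lik)
```

Each forward step in this sketch costs on the order of N² transition evaluations for N particles per time step, which is one reason a compact particle set matters for this kind of recursion.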

Abstract

State estimation in robotic systems presents significant challenges, particularly due to the prevalence of multimodal posterior distributions in real-world scenarios. One effective strategy for handling such complexity is to compute maximum a posteriori (MAP) sequences over a discretized or sampled state space, which enables a concise representation of the most likely state trajectory. However, this approach often incurs substantial computational costs, especially in high-dimensional settings. In this article, we propose a novel MAP sequence estimation method, Stein-MAP-Seq, which effectively addresses multimodality while substantially reducing computational and memory overhead. Our key contribution is a sequential variational inference framework that captures temporal dependencies in dynamical system models and integrates Stein variational gradient descent (SVGD) into a Viterbi-style dynamic programming algorithm, enabling computationally efficient MAP sequence estimation. This integration allows the method to focus computational effort on MAP-consistent modes rather than exhaustively exploring the entire state space. Stein-MAP-Seq inherits the parallelism and mode-seeking behavior of SVGD, allowing particle updates to be efficiently executed on parallel hardware and significantly reducing the number of trajectory candidates required for MAP-sequence recursion compared to conventional methods that rely on hundreds to thousands of particles. We validate the proposed approach on a range of highly multimodal scenarios, including nonlinear dynamics with ambiguous observations, unknown data association with outliers, range-only localization under temporary unobservability, and high-dimensional robotic manipulators. Experimental results demonstrate substantial improvements in estimation accuracy and robustness to multimodality over existing estimation methods.
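
For readers unfamiliar with SVGD, the particle update the abstract refers to can be sketched as below. This is the generic SVGD step with an RBF kernel and a median-heuristic bandwidth, not the sequential variant developed in the paper; the names `svgd_step` and `grad_log_mixture`, the step size, and the two-component Gaussian mixture target are assumptions chosen only to show the kernel-weighted attraction and the repulsion between particles.

```python
import numpy as np

def svgd_step(X, grad_log_p, step=0.05):
    """One generic SVGD update for particles X of shape (N, d), RBF kernel."""
    N = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]        # diff[i, j] = x_i - x_j
    sq = np.sum(diff**2, axis=2)                # squared pairwise distances
    # Median heuristic for the bandwidth (small constant avoids division by zero).
    h2 = np.median(sq) / np.log(N + 1) + 1e-12
    K = np.exp(-sq / h2)                        # K[i, j] = k(x_i, x_j)
    G = grad_log_p(X)                           # (N, d) score at each particle
    attract = K @ G                             # kernel-weighted scores pull toward modes
    # sum_j grad_{x_j} k(x_j, x_i) = (2 / h2) * sum_j K[i, j] * (x_i - x_j)
    repulse = (2.0 / h2) * (X * K.sum(axis=1, keepdims=True) - K @ X)
    return X + step * (attract + repulse) / N

def grad_log_mixture(X, mus=(-2.0, 2.0), s=0.5):
    """Score of an equal-weight 1-D Gaussian mixture (assumed bimodal target)."""
    mus = np.asarray(mus)
    logc = -0.5 * ((X - mus[None, :]) / s) ** 2          # (N, 2) component log-densities
    r = np.exp(logc - logc.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)                    # component responsibilities
    return np.sum(r * (-(X - mus[None, :]) / s**2), axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1))                    # 20 particles, 1-D state
for _ in range(300):
    X = svgd_step(X, grad_log_mixture)
```

Every particle's update depends only on the current particle set, so the kernel matrix and the score evaluations are natural candidates for batched execution on parallel hardware.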

Paper Structure

This paper contains 28 sections, 5 theorems, 67 equations, 14 figures, 8 tables, 2 algorithms.

Key Result

Lemma 4.1

(Form of the optimal proposal distribution.) Let Assumption assump::MAP hold. Then the optimal proposal distribution $q(x_{0:T})$ in eq::Stein_MAP_opt1 factorizes as

Figures (14)

  • Figure 1: Integration of SVGD-generated sequential particle sets with dynamic programming to yield globally optimal MAP trajectories.
  • Figure 2: (Top) True state trajectory and noisy observations (red dots) with sign ambiguity, resulting in a bimodal posterior distribution. (Middle) Time-indexed grid-based posterior density illustrating the evolution of bimodality. (Bottom) One simulation over the interval ($43$–$62\mathrm{s}$) showing that Stein-MAP-Seq resolves bimodality and recovers the correct trajectory, whereas baseline MAP-Seq estimators exhibit incorrect mode selection.
  • Figure 3: RMSE results over $50$ Monte Carlo simulations with $100$ time steps are reported for a bimodal posterior under high process and measurement uncertainty. Finite-state estimators are generally more robust than Gaussian-assumed estimators in multimodal settings, except for point-wise MAP estimators. Point-wise MAP estimators frequently select incorrect modes and therefore perform worse than MMSE estimators, which effectively average across modes. Among MAP sequence estimators, Stein-MAP-Seq and GTSAM achieve the best performance, whereas PF-based MAP methods exhibit the highest RMSE due to unstable mode selection.
  • Figure 4: The averaged computation time per step is reported over $50$ Monte Carlo simulations. Filtering-based methods are more computationally efficient than MAP-Seq estimators, since MAP-Seq estimation requires trajectory-level inference. PF-MAP-Seq estimators incur the highest computational cost due to the large number of particles required, whereas Stein-MAP-Seq achieves competitive performance with fewer particles and computational cost comparable to GTSAM.
  • Figure 5: (a) A robot follows a counter-clockwise reference trajectory (black dashed curve) while executing noisy motion dynamics at every time step, causing the realized trajectory to gradually deviate from the noise-free reference. At each step, the robot receives a range–bearing measurement from one of four landmarks with unknown data association and heavy-tailed outliers, resulting in a highly multimodal measurement likelihood that varies across locations. (b) Trajectory estimates produced by Gaussian-assumed methods, illustrating bias and inconsistency under nonlinear motion and multimodal measurement likelihoods. (c) PF-based estimators with 500 particles improve robustness compared to Gaussian methods but still exhibit noticeable trajectory distortion due to weight degeneracy and resampling effects. (d) Stein-based estimators with 20 particles accurately recover the reference trajectory while remaining robust to multimodality and outliers, demonstrating reliable estimation with a small number of particles.
  • ...and 9 more figures
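
The contrast drawn in the Figure 2 and Figure 3 captions between point-wise MAP and MMSE estimates under sign ambiguity can be reproduced on a single-step toy problem. The sketch below is only an illustration under assumed values (a Gaussian prior with standard deviation 3, an observation y = x² + noise with noise standard deviation 0.5, and a grid over [-4, 4]); none of these numbers come from the paper.

```python
import numpy as np

# Assumed toy model: prior x ~ N(0, 3^2), observation y = x^2 + noise, noise std 0.5.
y, noise_std, prior_std = 4.0, 0.5, 3.0
x = np.linspace(-4.0, 4.0, 801)                      # symmetric grid over the state
log_post = -0.5 * (x / prior_std)**2 - 0.5 * ((y - x**2) / noise_std)**2
post = np.exp(log_post - log_post.max())
post /= post.sum()                                   # normalized grid posterior

x_map = x[np.argmax(post)]                           # point-wise MAP: sits on one mode
x_mmse = np.sum(x * post)                            # MMSE: posterior mean, between modes

# Posterior-expected squared error of each point estimate.
err_map = np.sum(post * (x - x_map)**2)
err_mmse = np.sum(post * (x - x_mmse)**2)
```

In this toy setting the posterior has two symmetric modes near ±2. The MMSE estimate falls near zero, where the posterior mass is negligible, yet its expected squared error is lower than that of an estimate that commits to a single mode and is wrong about half the time. Resolving which mode is correct requires the temporal information that sequence-level estimators exploit.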

Theorems & Definitions (5)

  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Theorem 4.1
  • Lemma 1