Dual Online Stein Variational Inference for Control and Dynamics

Lucas Barcelos, Alexander Lambert, Rafael Oliveira, Paulo Borges, Byron Boots, Fabio Ramos

TL;DR

This work tackles robust model predictive control under both parametric uncertainty and changing environments by formulating MPC as Bayesian inference and introducing Dual Online Stein Variational Inference for Control and Dynamics (DuSt-MPC). It maintains particle-based posteriors over policy parameters $\boldsymbol{\theta}_t$ and dynamics parameters $\boldsymbol{\xi}$, updating them online via Stein Variational Gradient Descent (SVGD), with separate, sequential updates for policy and dynamics. The method yields multi-modal posteriors and rapid adaptation to parameter shifts, demonstrated through simulated inverted pendulum and obstacle navigation tasks, plus real-time trajectory tracking on an autonomous ground vehicle. Overall, DuSt-MPC provides a principled, scalable framework to incorporate model uncertainty directly into online control, improving robustness in realistic robotics settings where dynamics change and observations are noisy. The approach leverages a Bernoulli-optimality likelihood $\exp(-\alpha C[\boldsymbol{\tau}])$ and kernelized SVGD updates in an online, sequential setting, enabling efficient, real-time inference.
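
The particle update summarised above can be illustrated with a minimal sketch of SVGD on a one-dimensional toy problem. It is not the paper's implementation: the quartic cost, the flat prior, the RBF kernel bandwidth, the step size, and all function names are assumptions chosen for illustration, and the sketch omits the dynamics posterior and the receding-horizon structure entirely. It only shows how particles that follow the gradient of $\log \exp(-\alpha C)$, combined with a kernel repulsion term, can settle on more than one mode.

```python
import math
import random


def cost(theta):
    """Hypothetical 1-D cost with two minima, at theta = -1 and theta = +1."""
    return (theta * theta - 1.0) ** 2


def grad_log_target(theta, alpha):
    """Gradient of log exp(-alpha * cost(theta)), assuming a flat prior."""
    return -alpha * 4.0 * theta * (theta * theta - 1.0)


def svgd_step(particles, alpha, bandwidth, step_size):
    """One SVGD update with RBF kernel k(a, b) = exp(-(a - b)^2 / bandwidth)."""
    n = len(particles)
    updated = []
    for x_i in particles:
        phi = 0.0
        for x_j in particles:
            k = math.exp(-((x_j - x_i) ** 2) / bandwidth)
            # Attraction: kernel-weighted gradient of the log target at x_j.
            attraction = k * grad_log_target(x_j, alpha)
            # Repulsion: derivative of k(x_j, x_i) with respect to x_j,
            # which pushes x_i away from nearby particles.
            repulsion = -2.0 * (x_j - x_i) / bandwidth * k
            phi += attraction + repulsion
        updated.append(x_i + step_size * phi / n)
    return updated


def run_svgd(n_particles=20, n_steps=500, alpha=2.0,
             bandwidth=0.5, step_size=0.02, seed=0):
    """Run SVGD from uniformly initialised particles and return the final set."""
    rng = random.Random(seed)
    particles = [rng.uniform(-2.0, 2.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        particles = svgd_step(particles, alpha, bandwidth, step_size)
    return particles


if __name__ == "__main__":
    final = run_svgd()
    print(sorted(round(p, 2) for p in final))
```

In this toy setting the particles end up split between the two minima of the cost rather than collapsing onto one, which is the property the summary refers to as a multi-modal posterior. A fixed bandwidth is used here for simplicity; adaptive choices are common in practice.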

Abstract

Model predictive control (MPC) schemes have a proven track record for delivering aggressive and robust performance in many challenging control tasks, coping with nonlinear system dynamics, constraints, and observational noise. Despite their success, these methods often rely on simple control distributions, which can limit their performance in highly uncertain and complex environments. MPC frameworks must be able to accommodate changing distributions over system parameters, based on the most recent measurements. In this paper, we devise an implicit variational inference algorithm able to estimate distributions over model parameters and control inputs on-the-fly. The method incorporates Stein Variational gradient descent to approximate the target distributions as a collection of particles, and performs updates based on a Bayesian formulation. This enables the approximation of complex multi-modal posterior distributions, typically occurring in challenging and realistic robot navigation tasks. We demonstrate our approach on both simulated and real-world experiments requiring real-time execution in the face of dynamically changing environments.

Paper Structure

This paper contains 20 sections, 30 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Online parameter estimation for autonomous ground vehicles. Distributions over system parameters, such as the inertial center of rotation (ICR), are adapted in real time. (a) The custom-built skid-steer robot platform used in experiments. (b) Distribution over $x_{\operatorname{ICR}}$ at different time steps. The mass load on the robot is suddenly increased during system execution. The parameter distribution estimate quickly changes to include a second mode that better explains the new dynamics. Our particle-based control scheme can accommodate such multi-modal uncertainty and adapt to dynamically changing environments.
  • Figure 2: Point-mass navigation task. The plots show trajectories from the start position (red dot) towards the goal (red star). Left: Trajectories executed by SVMPC. Note that, as the mass of the robot changes, the model mismatch causes many of the episodes to crash (x markers). Centre: Trajectories executed by DuSt-MPC. Depending on the state of the system when the mass change occurs, a few trajectories deviate from the centre path to avoid collisions. A few trajectories are truncated due to the fixed episode length. Right: Ridge plot of the distribution over mass along several steps of the simulation. The vertical dashed line denotes the true mass. Mass is initially set at 2 kg and changed to 3 kg at step 100.
  • Figure 3: Inverted pendulum. (a) Mean cumulative cost over 10 episodes. The shaded region represents the 50% confidence interval. The high variance is expected since each scenario has parameters sampled from a uniform distribution. (b) Posterior distribution over the pendulum pole mass at the final step of one of the episodes. The true latent value is shown by the red star marker.
  • Figure 4: AGV trajectory tracking. (a) Raw cost over time. The number of steps before and after the change of mass is normalised for proper comparison. (b) Trajectories executed by each method. Line style changes when the mass changes. Markers denote the initial position and the position at which the mass changes.