Dual Online Stein Variational Inference for Control and Dynamics
Lucas Barcelos, Alexander Lambert, Rafael Oliveira, Paulo Borges, Byron Boots, Fabio Ramos
TL;DR
This work tackles robust model predictive control under both parametric uncertainty and changing environments by formulating MPC as Bayesian inference and introducing Dual Online Stein Variational Inference for Control and Dynamics (DuSt-MPC). It maintains particle-based posteriors over policy parameters $\boldsymbol{\theta}_t$ and dynamics parameters $\boldsymbol{\xi}$, updating them online via Stein Variational Gradient Descent (SVGD), with separate, sequential updates for policy and dynamics. The method yields multi-modal posteriors and rapid adaptation to parameter shifts, demonstrated through simulated inverted pendulum and obstacle navigation tasks, plus real-time trajectory tracking on an autonomous ground vehicle. Overall, DuSt-MPC provides a principled, scalable framework to incorporate model uncertainty directly into online control, improving robustness in realistic robotics settings where dynamics change and observations are noisy. The approach leverages a Bernoulli-optimality likelihood $\exp(-\alpha C[\boldsymbol{\tau}])$ and kernelized SVGD updates in an online, sequential setting, enabling efficient, real-time inference.
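To make the kernelized SVGD update concrete, the following is a minimal NumPy sketch of the generic SVGD particle update applied to a target of the form $\exp(-\alpha C(\theta))$. It is not the authors' implementation and omits the dual, sequential policy and dynamics updates of DuSt-MPC. The 1-D double-well cost `cost_grad`, the fixed RBF bandwidth `h`, the step size, the particle count, and the value of `alpha` are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(x, h):
    """RBF kernel matrix and its gradient for particles x of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * h ** 2))         # k[j, i] = k(x_j, x_i)
    # grad_k[j, i] = gradient of k(x_j, x_i) with respect to x_j
    grad_k = -diff / h ** 2 * k[..., None]
    return k, grad_k

def svgd_step(x, grad_log_p, step, h):
    """One SVGD update: kernel-weighted attraction plus kernel repulsion."""
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    g = grad_log_p(x)                              # shape (n, d)
    # phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * g(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k.T @ g + grad_k.sum(axis=0)) / n
    return x + step * phi

# Hypothetical toy cost C(theta) = (theta^2 - 1)^2 with two minima at +/-1,
# so the target exp(-alpha * C) is bimodal.
alpha = 2.0                                        # assumed temperature

def cost_grad(theta):
    return 4.0 * theta * (theta ** 2 - 1.0)

def grad_log_p(theta):
    # log p(theta) = -alpha * C(theta) + const
    return -alpha * cost_grad(theta)

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 0.5, size=(20, 1))     # assumed initial particles
for _ in range(1000):
    particles = svgd_step(particles, grad_log_p, step=0.02, h=0.3)

print(np.sort(particles.ravel()))
```

The first term of `phi` pulls particles toward low-cost regions, while the kernel-gradient term pushes nearby particles apart, which is what lets the particle set cover more than one mode instead of collapsing to a single solution. In practice the bandwidth is often set adaptively (for example with a median heuristic) rather than fixed as it is here.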
Abstract
Model predictive control (MPC) schemes have a proven track record for delivering aggressive and robust performance in many challenging control tasks, coping with nonlinear system dynamics, constraints, and observational noise. Despite their success, these methods often rely on simple control distributions, which can limit their performance in highly uncertain and complex environments. MPC frameworks must be able to accommodate changing distributions over system parameters, based on the most recent measurements. In this paper, we devise an implicit variational inference algorithm able to estimate distributions over model parameters and control inputs on-the-fly. The method incorporates Stein Variational Gradient Descent to approximate the target distributions as a collection of particles, and performs updates based on a Bayesian formulation. This enables the approximation of complex multi-modal posterior distributions, typically occurring in challenging and realistic robot navigation tasks. We demonstrate our approach on both simulated and real-world experiments requiring real-time execution in the face of dynamically changing environments.
