Tractable Stochastic Hybrid Model Predictive Control using Gaussian Processes for Repetitive Tasks in Unseen Environments
Leroy D'Souza, Yash Vardhan Pant, Sebastian Fischmeister
TL;DR
The paper tackles control under time-varying, multimodal residual dynamics in unseen environments. It learns mode distributions with a mode-mapping classifier and replaces the intractable hybrid MINLP MPC with two tractable NLP-based approximations. It introduces a likelihood-prior adaptation mechanism that blends mode-residual likelihoods with priors derived from the classifier, enabling online updates as environments change. The proposed NLP-Endo and NLP-Exo controllers retain performance while dramatically improving tractability, supporting longer horizons and safety constraints. In simulation on planar-LTI and 2D quadrotor tasks, the controllers run up to 250x faster than the baseline MINLP, and adaptive mode-mapping improves controller performance by up to 3x over iterations. Overall, the approach enables robust, data-driven MPC in unseen, evolving environments by integrating probabilistic mode inference with tractable control relaxations.
Abstract
Improving the predictive accuracy of a dynamics model is crucial to obtaining good control performance and safety from Model Predictive Controllers (MPC). One approach involves learning unmodelled (residual) dynamics in addition to nominal models derived from first principles. Residual dynamics that vary across an environment manifest as modes of a piecewise residual (PWR) model, which requires a) identifying how modes are distributed across the environment and b) solving a computationally intensive Mixed Integer Nonlinear Program (MINLP) for control. We develop an iterative mapping algorithm capable of predicting time-varying mode distributions. We then develop two tractable approximations of the MINLP and combine them with the predictor in closed loop to solve the overall control problem. In simulation, we first demonstrate that the approximations improve performance by 4-18% compared to the MINLP while achieving significantly lower computation times (up to 250x faster). We then demonstrate that the proposed mapping algorithm incrementally improves controller performance (up to 3x) over multiple iterations of a trajectory tracking control task, even when the mode distributions change over time.
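
As a rough illustration of the likelihood-prior blending mentioned in the TL;DR, the sketch below applies Bayes' rule to combine a prior over modes (standing in for a classifier output) with the likelihood of an observed residual under each mode. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function name `mode_posterior`, the independent-Gaussian likelihood, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def mode_posterior(prior, residual, mode_means, mode_vars):
    """Blend a prior over modes with Gaussian residual likelihoods (Bayes' rule).

    Assumption: each mode's residual is modelled as an independent Gaussian
    per output dimension; the paper's actual residual models may differ.
    """
    prior = np.asarray(prior, dtype=float)          # shape (n_modes,)
    residual = np.asarray(residual, dtype=float)    # shape (dim,)
    means = np.asarray(mode_means, dtype=float)     # shape (n_modes, dim)
    variances = np.asarray(mode_vars, dtype=float)  # shape (n_modes, dim)

    # Log-likelihood of the observed residual under each mode.
    log_lik = -0.5 * np.sum(
        (residual - means) ** 2 / variances + np.log(2.0 * np.pi * variances),
        axis=1,
    )

    # Unnormalised log-posterior; the small constant guards against log(0).
    log_post = np.log(prior + 1e-12) + log_lik
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical two-mode example: the observed residual lies close to mode 1.
posterior = mode_posterior(
    prior=[0.5, 0.5],
    residual=[0.9, 0.1],
    mode_means=[[0.0, 0.0], [1.0, 0.0]],
    mode_vars=[[0.1, 0.1], [0.1, 0.1]],
)
print(posterior)
```

With an uninformative prior, the posterior is driven by the likelihood and concentrates on the mode whose mean is nearest the observed residual. A sharper prior would shift this balance, which is the sense in which the two sources of information are blended.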
