Tractable Stochastic Hybrid Model Predictive Control using Gaussian Processes for Repetitive Tasks in Unseen Environments

Leroy D'Souza, Yash Vardhan Pant, Sebastian Fischmeister

TL;DR

The paper tackles control under time-varying, multimodal residual dynamics in unseen environments by learning mode distributions with a mode-mapping classifier and replacing intractable hybrid MINLP MPC with two tractable NLP-based approximations. It introduces a likelihood-prior adaptation mechanism that blends mode-residual likelihoods with priors derived from the classifier, enabling online updates as environments change. The proposed NLP-Endo and NLP-Exo controllers retain performance while dramatically improving tractability, supporting longer horizons and safety constraints; adaptive mode-mapping is demonstrated on planar-LTI and 2D quadrotor tasks with up to 3x gains in controller performance over iterations and up to 250x speedups over the baseline MINLP. Overall, the approach enables robust, data-driven MPC in unseen, evolving environments by integrating probabilistic mode inference with tractable control relaxations.
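The likelihood-prior blending mentioned above can be illustrated with a small sketch. This is not the paper's update rule: the convex-combination form, the function names (`gaussian_pdf`, `mode_likelihoods`, `blend`), and the scalar residual are all assumptions made for illustration. The sketch scores a measured residual under each mode's Gaussian prediction, then mixes the normalized likelihoods with a classifier prior using a weight `alpha`, so that a smaller `alpha` adapts more slowly to new evidence.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a scalar Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mode_likelihoods(residual, predictions):
    """Normalized likelihood of a measured residual under each mode.

    `predictions` is a list of (mean, std) pairs, one per mode, standing in
    for the Gaussian that each mode's residual model would predict.
    """
    raw = [gaussian_pdf(residual, mean, std) for mean, std in predictions]
    total = sum(raw)
    if total == 0.0:
        # Every density underflowed: fall back to a uniform distribution.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def blend(prior, likelihood, alpha):
    """Convex blend of prior and likelihood (ASSUMED form, not the paper's).

    alpha = 0 keeps the prior unchanged; alpha = 1 trusts the likelihood fully.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    mixed = [(1.0 - alpha) * p + alpha * l for p, l in zip(prior, likelihood)]
    total = sum(mixed)
    return [m / total for m in mixed]


if __name__ == "__main__":
    prior = [0.95, 0.05]       # classifier strongly favours mode 0
    likelihood = [0.05, 0.95]  # measured residual favours mode 1
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, blend(prior, likelihood, alpha))
```

With a prior of 0.95 and a likelihood of 0.05 for the first mode, the blended probability of that mode falls from 0.86 at `alpha = 0.1` to 0.14 at `alpha = 0.9`, which matches the qualitative behaviour of slower adaptation for smaller weights.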

Abstract

Improving the predictive accuracy of a dynamics model is crucial to obtaining good control performance and safety from Model Predictive Controllers (MPC). One approach involves learning unmodelled (residual) dynamics, in addition to nominal models derived from first principles. Varying residual models across an environment manifest as modes of a piecewise residual (PWR) model that requires a) identifying how modes are distributed across the environment and b) solving a computationally intensive Mixed Integer Nonlinear Program (MINLP) problem for control. We develop an iterative mapping algorithm capable of predicting time-varying mode distributions. We then develop and solve two tractable approximations of the MINLP to combine with the predictor in closed-loop to solve the overall control problem. In simulation, we first demonstrate how the approximations improve performance by 4-18% in comparison to the MINLP while achieving significantly lower computation times (up to 250x faster). We then demonstrate how the proposed mapping algorithm incrementally improves controller performance (up to 3x) over multiple iterations of a trajectory tracking control task even when the mode distributions change over time.

Paper Structure

This paper contains 17 sections, 16 equations, 12 figures, 2 tables, 1 algorithm.

Figures (12)

  • Figure 1: An autonomous vehicle subject to different unmodelled (residual) dynamics depending on its position in the workspace as a result of different terrains in the environment.
  • Figure 2: The mode-mapping classifier block involves using a batch trajectory dataset to iteratively improve estimates of the time-varying functions $\delta^{m}(z_k)$ (\ref{eq:dynamics}) using the likelihood-prior trade-off scheme outlined in Section \ref{subsec:mode_mapping}. The planner block uses a reference trajectory to obtain approximations to quantities which are then used to obtain approximations to the baseline controller (\ref{eq:hybrid_opt_final}), as further described in Section \ref{subsec:minlp2nlp}.
  • Figure 3: A visualization of how each model in $\hat{g}_\text{set}$ infers a Gaussian given a deterministic input. The likelihood of the true measured residual under each mode's Gaussian can be computed.
  • Figure 4: A directed graph showing dependence relations between the variables affecting the residual magnitude. Note $M(y^\delta_k) \equiv M(z_k)$.
  • Figure 5: (a) For a prior of 0.95 and likelihood of 0.05, this graph shows that smaller $\alpha_k(y^\delta_{k})$ results in slower adaptation of the posterior to the change in the likelihood (i.e., mode distribution). (b) A demonstration of how larger $\omega$ yields a smoother $\mathcal{K}(\cdot)$ function.
  • ...and 7 more figures

Theorems & Definitions (1)

  • Remark II.1