Zero-Order Optimization for Gaussian Process-based Model Predictive Control

Amon Lahr, Andrea Zanelli, Andrea Carron, Melanie N. Zeilinger

TL;DR

This work employs a tailored Jacobian approximation within a sequential quadratic programming (SQP) approach and combines it with a parallelizable GP inference and automatic differentiation framework to solve the optimal control problem of Gaussian process-based model predictive control in real time.

Abstract

By enabling constraint-aware online model adaptation, model predictive control using Gaussian process (GP) regression has exhibited impressive performance in real-world applications and received considerable attention in the learning-based control community. Yet, solving the resulting optimal control problem in real time generally remains a major challenge, due to i) the increased number of augmented states in the optimization problem, as well as ii) computationally expensive evaluations of the posterior mean and covariance and their respective derivatives. To tackle these challenges, we employ i) a tailored Jacobian approximation in a sequential quadratic programming (SQP) approach, and combine it with ii) a parallelizable GP inference and automatic differentiation framework. Reducing the numerical complexity with respect to the state dimension $n_x$ for each SQP iteration from $\mathcal{O}(n_x^6)$ to $\mathcal{O}(n_x^3)$, and accelerating GP evaluations on a graphics processing unit (GPU), the proposed algorithm computes suboptimal, yet feasible solutions at drastically reduced computation times and exhibits favorable local convergence properties. Numerical experiments verify the scaling properties and investigate the runtime distribution across different parts of the algorithm.
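One way to read the step from $\mathcal{O}(n_x^6)$ to $\mathcal{O}(n_x^3)$ is through the size of the augmented state. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes the full problem carries the nominal state together with the unique entries of a symmetric $n_x \times n_x$ covariance, that the zero-order variant optimizes over the nominal state only, and that the per-stage cost of an SQP iteration grows with the cube of the number of state variables. Constants, the horizon length, and the input dimension are ignored, and all function names are hypothetical.

```python
def n_cov_entries(n_x: int) -> int:
    """Unique entries of a symmetric n_x-by-n_x covariance matrix."""
    return n_x * (n_x + 1) // 2


def augmented_state_dim(n_x: int) -> int:
    """Nominal state plus vectorized covariance (assumed layout of the full problem)."""
    return n_x + n_cov_entries(n_x)


def cubic_cost(n: int) -> int:
    """Stand-in cost model: per-stage work taken as n**3 (assumption, constants dropped)."""
    return n ** 3


def compare(n_x: int) -> dict:
    """Compare the modeled per-stage cost with and without covariance states."""
    n_aug = augmented_state_dim(n_x)
    full = cubic_cost(n_aug)       # grows like n_x**6, since n_aug grows like n_x**2
    zero_order = cubic_cost(n_x)   # grows like n_x**3
    return {
        "n_x": n_x,
        "n_aug": n_aug,
        "full": full,
        "zero_order": zero_order,
        "ratio": full / zero_order,
    }


if __name__ == "__main__":
    for n_x in (3, 9, 15, 21, 27, 33):
        row = compare(n_x)
        print(f"n_x={row['n_x']:3d}  n_aug={row['n_aug']:4d}  ratio={row['ratio']:.0f}")
```

Under this cost model the ratio between the two variants equals $((n_x + 3)/2)^3$, so the gap widens cubically with the state dimension. Measured timings will differ, since GP evaluations and solver overhead are not part of the model.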

Paper Structure

This paper contains 19 sections, 2 theorems, 31 equations, 3 figures, and 2 algorithms.

Key Result

Lemma 1

Let Assumptions ass:strong_regularity and ass:jacobian_diff hold. Denote by $z_+$ a solution to eq:GE_zoro_lin constructed at the linearization point $\hat{z}$. Then, there exist strictly positive constants $\kappa < 1$ and $r_\kappa$, such that, for any $\hat{z} \in {B}(\bar{z},r_\kappa)$, it holds

Figures (3)

  • Figure 1: Infeasibility arising from fixing the covariances based on the previous MPC instance. The predicted state trajectory and covariances from the previous time step are drawn with solid lines in blue; the predicted covariances around the current predicted state trajectory are drawn in red; the references for the previous and current time steps are drawn with dashed lines; the infeasible region is shaded in light red. When the linearized dynamics at the shooting nodes vary strongly from one time step to the next, in this example because of a reference change, fixing the covariances based on the previous MPC instance might lead to significant prediction errors.
  • Figure 2: SQP timings comparison for increasing number of states $n_x$ and GP dimension $n_w = (n_x-3)/2$.
  • Figure 3: Timing profile for Alg. \ref{alg:gpzoro} variants for $n_x = 33$ ($n_\text{mass} = 7$).

Theorems & Definitions (4)

  • Remark 1
  • Lemma 1: cf. zanelliContractionEstimatesAbstract2019, Lemma 2
  • Lemma 2
  • proof