Real-Time Online Learning for Model Predictive Control using a Spatio-Temporal Gaussian Process Approximation

Lars Bartels; Amon Lahr; Andrea Carron; Melanie N. Zeilinger

Abstract

Learning-based model predictive control (MPC) can enhance control performance by correcting for model inaccuracies, enabling more precise state trajectory predictions than traditional MPC. A common approach is to model unknown residual dynamics as a Gaussian process (GP), which leverages data and also provides an estimate of the associated uncertainty. However, the high computational cost of online learning poses a major challenge for real-time GP-MPC applications. This work presents an efficient implementation of an approximate spatio-temporal GP model, offering online learning at constant computational complexity. It is optimized for GP-MPC, where it enables improved control performance by learning more accurate system dynamics online in real time, even for time-varying systems. The performance of the proposed method is demonstrated by simulations and hardware experiments in the exemplary application of autonomous miniature racing.
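
To make the residual-dynamics idea concrete, the following is a minimal sketch of exact GP regression on the mismatch between a nominal model prediction and a measured next state. It is not the approximate spatio-temporal model proposed in this work. It uses plain NumPy, a squared-exponential kernel, and hypothetical names (`rbf_kernel`, `gp_posterior`, `residual_targets`) and hyperparameters chosen only for illustration. The Cholesky factorization of the kernel matrix scales cubically with the number of data points, which is the cost that motivates approximations for online learning.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between row-wise inputs A (n, d) and B (m, d)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def residual_targets(x_next_measured, x_next_nominal):
    """Training targets: the part of the dynamics the nominal model misses."""
    return x_next_measured - x_next_nominal


def gp_posterior(X_train, y_train, X_test, noise_var=1e-4):
    """Exact GP posterior mean and variance of the latent residual function.

    The Cholesky factorization costs O(n^3) in the number of training points.
    """
    K = rbf_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_test, X_test)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


# Toy scalar system (an assumption for illustration only):
# nominal model x+ = 0.9 x, true system x+ = 0.9 x + 0.1 sin(x).
X = np.linspace(-3.0, 3.0, 30)[:, None]
y = residual_targets(0.9 * X[:, 0] + 0.1 * np.sin(X[:, 0]), 0.9 * X[:, 0])
mean, var = gp_posterior(X, y, np.array([[0.5], [10.0]]))
print("residual mean:", mean, "residual variance:", var)
```

In this toy setup the posterior variance is small near the training data and returns to the prior variance far away from it, which is the kind of uncertainty estimate a GP-MPC scheme can propagate along the predicted trajectory.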

Paper Structure (19 sections, 14 equations, 6 figures, 3 algorithms)

INTRODUCTION
Approximate GP Inference for MPC
Contribution
PROBLEM STATEMENT
Stochastic OCP Formulation
Deterministic OCP Formulation
Zero-Order Optimization
REAL-TIME SPATIO-TEMPORAL GP-MPC
Approximate Spatio-Temporal GP Model
Spatio-Temporal GP Inference for MPC
Python Implementation & Integration into L4acados
AUTONOMOUS MINIATURE RACING
Control Task
Controller Performance Comparison
Computational Performance
...and 4 more sections

Figures (6)

Figure 1: High-speed miniature racing despite a time-varying steering perturbation: Online learning using spatio-temporal Gaussian process approximations enables real-time adaptation to time-varying disturbances. The car aims to race along the track centerline (blue) while staying within track boundaries (red) and therefore plans its future trajectory (yellow) including uncertainty estimates based on the GP model of the residual dynamics.
Figure 2: Steering perturbation mapping parametrized by the neutral steering offset $\delta_0$ (red dot). The shaded area illustrates the set of all employed perturbation mappings, with magenta indicating positive values and green indicating negative values of $\delta_0$.
Figure 3: Experimental GP-MPC solve times using different GP models, with thick lines representing a rolling average of the last 100 solves. Left: Exact GP with a subset-of-data (SoD) approximation using the 400 most recent data points (blue), compared to simulation results using the full dataset (light blue). Right: Proposed approximate spatio-temporal GP model with 80 spatial inducing points based on hardware experiments (green).
Figure 4: One-step MPC prediction error over experiment runtime without a GP (red) and with online learning by an approximate spatio-temporal GP (blue) with a 2-$\sigma$ confidence interval to model the residual dynamics. The bar at the top illustrates the time-varying steering perturbation present in the system, with the neutral steering offset $\delta_0$ transitioning from zero through positive (green) and negative (magenta) values and back to zero.
Figure 5: Evolution of the most recent lap times over the experiment runtime under a time-varying steering perturbation for nominal MPC (red squares) and GP-MPC with an exact GP model (SoD) (orange diamonds), a conventional spatial inducing-point GP model (SoD) (purple triangles), and an approximate spatio-temporal GP model (blue circles). The exact GP and the spatial inducing-point GP employ a subset-of-data (SoD) approximation and are conditioned on the most recent 400 data points. Lap times are recorded and plotted at the moment the car completes a lap. The bar at the top illustrates the time-varying steering perturbation present in the system, with green corresponding to a positive neutral steering offset $\delta_0$ and magenta indicating a negative value.
...and 1 more figure
