Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling

Davide Celestini, Daniele Gammelli, Tommaso Guffanti, Simone D'Amico, Elisa Capello, Marco Pavone

TL;DR

This work presents a unified framework to combine the main strengths of optimization-based and learning-based methods for MPC, embedding high-capacity, transformer-based neural network models within the optimization process for trajectory generation and improving overall MPC runtime by 7x without loss in performance.

Abstract

Model predictive control (MPC) has established itself as the primary methodology for constrained control, enabling general-purpose robot autonomy in diverse real-world scenarios. However, for most problems of interest, MPC relies on the recursive solution of highly non-convex trajectory optimization problems, leading to high computational complexity and strong dependency on initialization. In this work, we present a unified framework to combine the main strengths of optimization-based and learning-based methods for MPC. Our approach entails embedding high-capacity, transformer-based neural network models within the optimization process for trajectory generation, whereby the transformer provides a near-optimal initial guess, or target plan, to a non-convex optimization problem. Our experiments, performed in simulation and the real world onboard a free flyer platform, demonstrate the capabilities of our framework to improve MPC convergence and runtime. Compared to purely optimization-based approaches, results show that our approach can improve trajectory generation performance by up to 75%, reduce the number of solver iterations by up to 45%, and improve overall MPC runtime by 7x without loss in performance.
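
The dependence on initialization described above can be illustrated with a deliberately small example. The sketch below is not the paper's method: it swaps in fixed-step gradient descent for the actual non-convex solver, and a hand-picked starting point stands in for the transformer's near-optimal initial guess. The cost function, step size, tolerance, and both starting points are assumptions chosen only for illustration.

```python
def cost(x):
    """Toy non-convex cost: local minimum near x = 0.96, global minimum near x = -1.04."""
    return (x**2 - 1.0) ** 2 + 0.3 * x


def grad(x):
    """Analytic derivative of the toy cost."""
    return 4.0 * x**3 - 4.0 * x + 0.3


def solve(x0, step=0.05, tol=1e-8, max_iters=10_000):
    """Fixed-step gradient descent; returns (solution, iterations used)."""
    x = x0
    for k in range(max_iters):
        g = grad(x)
        if abs(g) < tol:
            return x, k
        x -= step * g
    return x, max_iters


# Cold start: an uninformed initial guess (hypothetical value).
cold_x, cold_iters = solve(2.0)
# Warm start: a hand-picked guess near the global minimum, standing in
# for what a learned model might provide (hypothetical value).
warm_x, warm_iters = solve(-1.0)

reduction = 100.0 * (cold_iters - warm_iters) / cold_iters
print(f"cold: x={cold_x:.4f} cost={cost(cold_x):.4f} iters={cold_iters}")
print(f"warm: x={warm_x:.4f} cost={cost(warm_x):.4f} iters={warm_iters}")
print(f"iteration reduction: {reduction:.1f}%")
```

In this toy problem the cold start settles in the local minimum, while the warm start reaches the lower-cost minimum in fewer iterations. The numbers it prints say nothing about the paper's reported figures; they only show why a good initial guess can affect both solution quality and solver effort.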

Paper Structure

This paper contains 13 sections, 6 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: We propose a framework to combine high-capacity sequence models and optimization for MPC. The core idea is to train a transformer model to generate near-optimal trajectories (top), which can be used to warm-start optimal control problems (OCPs) at inference (bottom). We leverage methods introduced in Guffanti, Gammelli, et al. (2024) as a pre-training step, whereby a transformer is trained on pre-collected (open-loop) trajectory data (top left) to provide an initial guess for the full OCP (bottom left). For effective MPC, we fine-tune the model (top right) through closed-loop corrections (red) and use it to provide both an initial guess for the OCP and a target state to approximate future cost within the short-sighted problem (bottom right).
  • Figure 2: Visualization of the three scenarios considered in this work. For the tasks of (a) spacecraft rendezvous, (b) quadrotor control, and (c) free flyer control, we show three example trajectories obtained by warm-starting the SCP through a relaxation of the full OCP (blue) or through our proposed approach (cyan). Keep-out zones and obstacles are denoted by the red shaded areas.
  • Figure 3: Free flyer testbed and real-world execution of the FT-TTO trajectory in Figure 2c. Transparency identifies the time progression from the start (faded) to the end (opaque) of the trajectory.
  • Figure 4: Percentage improvement in terms of cost suboptimality (top) and number of SCP iterations (bottom) with respect to REL achieved by warm-starting the SCP with FT-TTO and TTO. Each bar represents the improvement averaged over non-convexity factors greater than or equal to the corresponding x-axis value.
  • Figure 5: Average normalized cost increment with respect to the problem lower bound.
  • ...and 2 more figures
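
The Figure 4 caption describes each bar as an average over all cases whose non-convexity factor is at least the x-axis value. A minimal sketch of that tail average follows. The function name `tail_average` and the sample values are hypothetical placeholders, not data from the paper, and the non-convexity factor is treated as an opaque number because this excerpt does not define it.

```python
def tail_average(samples, thresholds):
    """For each threshold, average the improvement over samples with factor >= threshold.

    samples: iterable of (non_convexity_factor, improvement_percent) pairs.
    Returns a dict mapping threshold -> mean improvement (None if no sample qualifies).
    """
    result = {}
    for t in thresholds:
        selected = [imp for factor, imp in samples if factor >= t]
        result[t] = sum(selected) / len(selected) if selected else None
    return result


# Synthetic placeholder data (NOT results from the paper).
samples = [(0, 10.0), (1, 20.0), (2, 30.0), (3, 60.0)]
bars = tail_average(samples, thresholds=[0, 1, 2, 3])

for threshold, mean_improvement in bars.items():
    print(f"factor >= {threshold}: mean improvement {mean_improvement:.2f}%")
```

Because each bar pools every case at or above its threshold, bars further to the right summarize progressively harder subsets of the same test cases rather than disjoint bins.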