Approximate non-linear model predictive control with safety-augmented neural networks

Henrik Hose, Johannes Köhler, Melanie N. Zeilinger, Sebastian Trimpe

TL;DR

This work addresses the computational burden of nonlinear MPC by learning an approximate MPC policy with neural networks and augmenting it with a safety layer that provides deterministic guarantees. The online safety mechanism validates NN-generated input sequences by forward-simulating their predictions and switches to a safe fallback when needed, which ensures constraint satisfaction and convergence without requiring the approximation error to be small. The work also extends the framework to robust MPC by using constraint tightening and training the NN on robust MPC data, yielding deterministic safety even under bounded disturbances. Numerical experiments on three nonlinear MPC benchmarks show speedups of several orders of magnitude over online optimization and demonstrate the importance of the safety augmentation, since a naively deployed NN can violate constraints. The findings highlight the practical potential for real-time, deterministically safe MPC on resource-constrained systems, while acknowledging open issues such as out-of-distribution generalization and set-valued solutions that warrant further research.

Abstract

Model predictive control (MPC) achieves stability and constraint satisfaction for general nonlinear systems, but requires computationally expensive online optimization. This paper studies approximations of such MPC controllers via neural networks (NNs) to achieve fast online evaluation. We propose safety augmentation that yields deterministic guarantees for convergence and constraint satisfaction despite approximation inaccuracies. We approximate the entire input sequence of the MPC with NNs, which allows us to verify online if it is a feasible solution to the MPC problem. We replace the NN solution by a safe candidate based on standard MPC techniques whenever it is infeasible or has worse cost. Our method requires a single evaluation of the NN and forward integration of the input sequence online, which is fast to compute on resource-constrained systems. The proposed control framework is illustrated using two numerical non-linear MPC benchmarks of different complexity, demonstrating computational speedups that are orders of magnitude higher than online optimization. In the examples, we achieve deterministic safety through the safety-augmented NNs, where a naive NN implementation fails.
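
The online check described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy dynamics `f`, the costs, the constraint sets, the terminal controller, and all function names are hypothetical placeholders, and the toy terminal ingredients have not been verified to satisfy the conditions the paper assumes. The NN output is represented simply as a list of inputs.

```python
import numpy as np

N = 10  # prediction horizon (assumed value)

def f(x, u):
    """Toy dynamics (hypothetical): Euler-discretized double integrator."""
    dt = 0.1
    return np.array([x[0] + dt * x[1], x[1] + dt * u])

def stage_cost(x, u):
    return float(x @ x + 0.1 * u * u)

def terminal_cost(x):
    return float(10.0 * (x @ x))

def terminal_controller(x):
    # Placeholder linear feedback, saturated to the input bounds.
    return float(np.clip(-1.0 * x[0] - 1.5 * x[1], -1.0, 1.0))

def in_state_set(x):
    return abs(x[0]) <= 1.0 and abs(x[1]) <= 1.0

def in_input_set(u):
    return abs(u) <= 1.0

def in_terminal_set(x):
    return float(x @ x) <= 0.05

def rollout(x0, u_seq):
    """Forward-simulate u_seq from x0; return (feasible, cost)."""
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for u in u_seq:
        if not (in_state_set(x) and in_input_set(u)):
            return False, float("inf")
        cost += stage_cost(x, u)
        x = f(x, u)
    if not in_terminal_set(x):
        return False, float("inf")
    return True, cost + terminal_cost(x)

def shifted_candidate(x_prev, u_prev):
    """Drop the first input of the previously applied sequence and append
    the terminal controller evaluated at the predicted terminal state."""
    x = np.asarray(x_prev, dtype=float)
    for u in u_prev:
        x = f(x, u)
    return list(u_prev[1:]) + [terminal_controller(x)]

def safe_step(x, u_nn, u_candidate):
    """Use the NN sequence only if it is feasible and its cost is no worse
    than that of the safe candidate; otherwise fall back to the candidate."""
    ok_nn, cost_nn = rollout(x, u_nn)
    ok_c, cost_c = rollout(x, u_candidate)
    if ok_nn and (not ok_c or cost_nn <= cost_c):
        return list(u_nn), "nn"
    return list(u_candidate), "candidate"
```

In a closed loop, one would apply the first input of the returned sequence and pass `shifted_candidate(x, returned_sequence)` as the candidate for the next step. The per-step cost is one NN evaluation plus forward simulations over the horizon, which is the source of the speedup over online optimization.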

Paper Structure

This paper contains 23 sections, 3 theorems, 15 equations, 3 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Let the assumptions on the MPC ingredients (label ass:mpcingredients) and on initial feasibility (label ass:initially-feasible) hold. Then the closed-loop system resulting from Algorithm 1 (label alg:online-validation) satisfies $(x(t), u(t)) \in \mathcal{X}\times\mathcal{U}$ for all $t\in\mathbb{N}$ and converges to the state $x=0$.
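
Results of this kind usually rest on the standard candidate-solution argument from MPC. The sketch below spells that argument out for the nominal case in assumed notation: the symbols $V$, $\ell$, $V_\mathrm{f}$, $K_\mathrm{f}$, $\mathcal{X}_\mathrm{f}$, $f$, and the horizon $N$ do not appear on this page, and the paper's exact conditions may differ. Here $\mathbf{u}(t)$ denotes the input sequence applied at time $t$, as opposed to the single input $u(t)$ in the theorem.

```latex
% Assumed notation: u_{k|t} is the k-th input of the sequence applied at time t,
% x_{k|t} the corresponding predicted state, with x_{0|t} = x(t).
% Assumed terminal condition, for all x in X_f:
%   V_f(f(x, K_f(x))) - V_f(x) <= -\ell(x, K_f(x)).
\begin{align*}
V(x(t), \mathbf{u}(t)) &= \sum_{k=0}^{N-1} \ell(x_{k|t}, u_{k|t}) + V_\mathrm{f}(x_{N|t}), \\
\tilde{\mathbf{u}}(t+1) &= \bigl(u_{1|t}, \dots, u_{N-1|t}, K_\mathrm{f}(x_{N|t})\bigr), \\
V\bigl(x(t+1), \tilde{\mathbf{u}}(t+1)\bigr) - V\bigl(x(t), \mathbf{u}(t)\bigr)
  &= -\ell\bigl(x(t), u_{0|t}\bigr) + \ell\bigl(x_{N|t}, K_\mathrm{f}(x_{N|t})\bigr) \\
  &\quad + V_\mathrm{f}\bigl(f(x_{N|t}, K_\mathrm{f}(x_{N|t}))\bigr) - V_\mathrm{f}(x_{N|t}) \\
  &\le -\ell\bigl(x(t), u_{0|t}\bigr).
\end{align*}
```

Because the NN sequence is accepted only when it is feasible and its cost does not exceed that of the candidate $\tilde{\mathbf{u}}(t+1)$, the cost of the applied sequence decreases by at least the stage cost at every step. With a positive definite stage cost, this is what gives convergence to $x=0$; constraint satisfaction follows because only sequences verified as feasible are ever applied.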

Figures (3)

  • Figure 1: Approximate MPC with safety-augmented NN. Input sequences from an NN controller are checked by the proposed safe online evaluation (see Alg. 1). If they yield constraint satisfaction and stability, they are applied to the system. Otherwise, a safe candidate obtained from the last time step, shifted and extended with the terminal controller, is used as a fallback.
  • Figure 2: Exemplary closed-loop simulation of the naive and the safety-augmented NN. When $\Pi_\text{NN}$ is applied naively (always) for 1.1 s (dotted line), the quadcopter crashes ($\textcolor{red}{\times}$) into the wall at $x_1 = 0.145$ m. With the safety-augmented NN presented in Algorithm 1 (solid line), a safe candidate is chosen over unsafe $\Pi_\text{NN}$ predictions at three time steps ($\bullet$), and the system adheres to the constraints.
  • Figure 3: Franka Panda robot arm environment (left) and closed-loop simulation with velocity-proportional disturbances (right). Vertical lines indicate setpoint changes. The safety-augmented NN (ours), shown in blue, avoids collision, while the naive NN, shown in red, collides just before the 2 s mark. The control performance of the NN controllers is close to that of the data-generating MPC with 10 solver iterations, which take 200 ms and are therefore too slow to run in real time.

Theorems & Definitions (9)

  • Remark 1
  • Theorem 1
  • proof
  • Remark 2
  • Lemma 2
  • proof
  • Remark 3
  • Lemma 3
  • proof