Learning a convex cost-to-go for single step model predictive control

E. M. Turan, Z. Mdoe, J. Jäschke

TL;DR

The paper tackles the computational burden of robust MPC on large uncertain systems by learning a convex surrogate of the cost-to-go, which allows a prediction horizon of one step. It develops two surrogate families, convex interpolants and input-convex neural networks (ICNNs), and analyzes how to design them so that they behave well near the origin and describe the feasible region. The authors provide a theoretical framework for embedding the surrogate into a one-step MPC formulation, along with practical training procedures and feasibility handling. Empirical results on QCQP and linear MPC case studies show that ICNNs deliver substantial speedups and data efficiency, while interpolants offer competitive accuracy with more inequality constraints, supporting the approach as a practical way to reduce online MPC complexity.
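To make the horizon-one idea concrete, the following is a minimal sketch and not the paper's formulation. It assumes a scalar, disturbance-free linear system $x^+ = ax + bu$ with stage cost $qx^2 + ru^2$ and simple input bounds; the names `riccati_scalar` and `one_step_mpc` are hypothetical. The cost-to-go is passed in as a function, and because it is assumed convex, the one-step objective is convex in $u$ and a plain golden-section search suffices.

```python
def riccati_scalar(a, b, q, r, iters=500):
    """Fixed-point iteration of the scalar discrete-time Riccati equation.

    Returns p such that V(x) = p * x**2 is the unconstrained LQR cost-to-go
    for x+ = a*x + b*u with stage cost q*x**2 + r*u**2 (illustrative setup).
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def one_step_mpc(x, a, b, q, r, cost_to_go, u_min=-1.0, u_max=1.0, tol=1e-10):
    """Solve min_u q*x^2 + r*u^2 + cost_to_go(a*x + b*u) over [u_min, u_max].

    Assumes cost_to_go is convex, so the objective is convex (hence unimodal)
    in u and golden-section search converges to the minimizer.
    """
    def objective(u):
        return q * x * x + r * u * u + cost_to_go(a * x + b * u)

    ratio = (5 ** 0.5 - 1) / 2  # inverse golden ratio
    lo, hi = u_min, u_max
    while hi - lo > tol:
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if objective(c) < objective(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


# Usage: with the exact LQR cost-to-go and inactive bounds, the one-step
# problem should reproduce the LQR feedback u = -K * x.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
p = riccati_scalar(a, b, q, r)
u_mpc = one_step_mpc(0.5, a, b, q, r, cost_to_go=lambda z: p * z * z)
u_lqr = -(a * b * p) / (r + b * b * p) * 0.5
```

In this toy setting the learned surrogate would simply replace the `cost_to_go` argument; the scenario tree, constraints, and feasibility handling discussed in the paper are not represented here.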

Abstract

For large uncertain systems, solving model predictive control problems online can be computationally taxing. Using a shorter prediction horizon can help, but may lead to poor performance and instability without appropriate modifications. This work focuses on learning convex objective terms to enable a single-step control horizon, reducing online computational costs. We consider two surrogates for approximating the cost-to-go: (1) a convex interpolating function and (2) an input-convex neural network. Regardless of the surrogate choice, its behavior near the origin and its ability to describe the feasible region are crucial for the closed-loop performance of the new MPC problem. We address this by tailoring the surrogate to ensure good performance in both aspects. We conclude with numerical examples, in which we compare the convex surrogates to using a standard neural network in the objective, solely using an LQR cost-to-go, and to using a neural network to learn a control policy. The proposed approaches are shown to achieve better performance with less data.
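As an illustration of the second surrogate, the sketch below shows a generic feedforward input-convex network in dependency-free Python. It is not the authors' architecture or training code: the layer layout, the ReLU activation, the random weights, and the names `make_icnn`, `icnn`, and `convexity_gap` are assumptions made for this example. The structural idea is that the weights acting on the previous hidden layer are non-negative and the activation is convex and non-decreasing, which makes the scalar output convex in the input $x$.

```python
import random


def matvec(W, v):
    """Matrix-vector product for lists of lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rand_matrix(rng, rows, cols, nonneg=False):
    """Random Gaussian matrix; entries are made non-negative if requested."""
    W = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return [[abs(w) for w in row] for row in W] if nonneg else W


def make_icnn(rng, n_x, hidden, n_layers):
    """Build random ICNN parameters (hypothetical layout, one dict per layer).

    "Wx" acts on the input x and is unconstrained; "Wz" acts on the previous
    hidden layer and is element-wise non-negative. The first layer has no "Wz".
    """
    layers, n_prev = [], 0
    for i in range(n_layers):
        n_out = 1 if i == n_layers - 1 else hidden
        layers.append({
            "Wx": rand_matrix(rng, n_out, n_x),
            "Wz": rand_matrix(rng, n_out, n_prev, nonneg=True) if n_prev else None,
            "b": [rng.gauss(0.0, 1.0) for _ in range(n_out)],
        })
        n_prev = n_out
    return layers


def icnn(layers, x):
    """Evaluate the network; the scalar output is convex in x."""
    z = None
    for i, layer in enumerate(layers):
        pre = [s + b for s, b in zip(matvec(layer["Wx"], x), layer["b"])]
        if z is not None:
            pre = [s + t for s, t in zip(pre, matvec(layer["Wz"], z))]
        # ReLU on hidden layers; the output layer is left linear.
        z = pre if i == len(layers) - 1 else [max(0.0, s) for s in pre]
    return z[0]


def convexity_gap(layers, x, y, t):
    """f(t*x + (1-t)*y) - (t*f(x) + (1-t)*f(y)); non-positive for convex f."""
    mid = [t * a + (1 - t) * b for a, b in zip(x, y)]
    return icnn(layers, mid) - (t * icnn(layers, x) + (1 - t) * icnn(layers, y))


# Usage: check the convexity inequality numerically at one random pair.
rng = random.Random(0)
net = make_icnn(rng, n_x=2, hidden=4, n_layers=3)
gap = convexity_gap(net, [1.0, -0.5], [-2.0, 0.3], 0.4)
```

During training, the non-negativity of the `"Wz"` weights would have to be maintained, for example by projection or reparameterization; which mechanism the paper uses is not stated in this summary.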


Paper Structure

This paper contains 21 sections, 2 theorems, 30 equations, 11 figures, and 1 algorithm.

Key Result

Theorem 1

Let the terminal cost of $\mathbb{P}^{MS}_N$ be chosen such that and let $d_{s,k}$ be chosen such that $\sum_s^S w_sd_{s,k}=0$. Then $\mathcal{V}^{MS}_{LQR}(x)$ is an under-estimator of $\mathcal{V}^{MS}_N(x)$.

Figures (11)

  • Figure 1: Fully branched scenario tree.
  • Figure 2: Illustration that a "good" convex approximator (red line) of $\mathcal{V}_N$ (black line) can be a poor choice of function to minimize due to non-unique minimizers.
  • Figure 3: Schematic of a feedforward input-convex neural network of $M$ layers. For simplicity $z_0$, $W^{(z)}_{0}$, and bias blocks are not shown.
  • Figure 4: Error in the approximation of $u$ using ICNNs and interpolating convex functions.
  • Figure 5: Solve times.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2