Learning the cost-to-go for mixed-integer nonlinear model predictive control

Christopher A. Orrico; W. P. M. H. Heemels; Dinesh Krishnamoorthy

TL;DR

The paper addresses the difficulty of solving mixed-integer nonlinear MPC (MINMPC) problems for hybrid systems in real time. It replaces the long-horizon value function with an offline-learned convex cost-to-go $V(x)= x^{\mathsf{T}} P x$, so that the online optimization needs only a one-step horizon. The key idea is to learn $P$ via inverse optimization from offline expert trajectories, formulated as a semidefinite program that enforces $P\succeq 0$ and near-KKT consistency. The online problem then minimizes $\ell(x(t),u,z) + V(f(x(t),u,z))$ subject to $g(x(t),u,z)\le 0$. The approach is demonstrated on a Lotka-Volterra fishing problem with discrete control, where the controller using the imputed cost-to-go performs nearly as well as the full-horizon MINMPC at a far lower online cost (e.g., 217 s versus 54.1 ms per decision, roughly a 4000-fold reduction). This makes real-time application of MINMPC to complex hybrid systems practical. The work also outlines directions for extending the cost-to-go beyond quadratic forms and for establishing stability and feasibility guarantees.
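The one-step online problem described above can be illustrated with a minimal sketch. It assumes a purely binary decision $z$ (no continuous input $u$), a Lotka-Volterra-style model discretized with an explicit Euler step, a quadratic tracking stage cost, and a cost-to-go evaluated on the deviation from a reference. The model coefficients, step size, reference, function names, and the matrix `P` are all illustrative assumptions, not values or code from the paper. With a single binary decision the mixed-integer problem reduces to enumerating the two choices.

```python
import numpy as np

# Illustrative assumptions only: none of these numbers come from the paper.
DT = 0.1                       # Euler step size
C0, C1 = 0.4, 0.2              # effect of fishing on prey and predator
X_REF = np.array([1.0, 1.0])   # reference population levels

def f(x, z):
    """One explicit-Euler step of the assumed predator-prey dynamics."""
    prey, pred = x
    d_prey = prey - prey * pred - C0 * prey * z
    d_pred = -pred + prey * pred - C1 * pred * z
    return x + DT * np.array([d_prey, d_pred])

def stage_cost(x, z):
    """Assumed stage cost l(x, z): squared deviation from the reference."""
    e = x - X_REF
    return float(e @ e)

def cost_to_go(x, P):
    """Quadratic cost-to-go V = e^T P e on the deviation e = x - X_REF."""
    e = x - X_REF
    return float(e @ P @ e)

def feasible(x, z):
    """Assumed constraint g <= 0: the next state must stay nonnegative."""
    return bool(np.all(f(x, z) >= 0.0))

def myopic_control(x, P, choices=(0, 1)):
    """Pick the feasible z minimizing l(x, z) + V(f(x, z)) by enumeration."""
    costs = {z: stage_cost(x, z) + cost_to_go(f(x, z), P)
             for z in choices if feasible(x, z)}
    if not costs:
        raise ValueError("no feasible discrete decision")
    return min(costs, key=costs.get), costs

# Usage: identity P as a placeholder for a learned matrix.
z_low, _ = myopic_control(np.array([0.5, 0.7]), np.eye(2))
z_high, _ = myopic_control(np.array([1.5, 1.5]), np.eye(2))
```

In this toy setting the controller refrains from fishing when both populations are below the reference and fishes when both are above it. A learned `P` would replace the identity placeholder.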

Abstract

Application of nonlinear model predictive control (NMPC) to problems with hybrid dynamical systems, disjoint constraints, or discrete controls often results in mixed-integer formulations with both continuous and discrete decision variables. However, solving mixed-integer nonlinear programming (MINLP) problems in real time is challenging, which can be a limiting factor in many applications. To address the computational complexity of solving mixed-integer nonlinear model predictive control problems in real time, this paper proposes an approximate mixed-integer NMPC formulation based on value function approximation. Leveraging Bellman's principle of optimality, the key idea is to divide the prediction horizon into two parts, where the optimal value function of the latter part of the prediction horizon is approximated offline using expert demonstrations. Doing so allows the MINMPC problem to be solved online with a considerably shorter prediction horizon, thereby reducing the online computation cost. The paper uses an inverted pendulum example with discrete controls to illustrate this approach.
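The horizon split in the abstract can be written out as a sketch in the notation of the TL;DR. The horizon lengths $N$ and $M$, the time index $k$, and the symbol $V^{\star}$ are notation assumed here for illustration, not taken from the paper. With dynamics $x_{k+1} = f(x_k,u_k,z_k)$ and constraints $g(x_k,u_k,z_k)\le 0$, Bellman's principle of optimality gives

$$
\min_{\{u_k,z_k\}_{k=0}^{N-1}} \sum_{k=0}^{N-1} \ell(x_k,u_k,z_k)
\;=\;
\min_{\{u_k,z_k\}_{k=0}^{M-1}} \left[ \sum_{k=0}^{M-1} \ell(x_k,u_k,z_k) + V^{\star}(x_M) \right],
\qquad
V^{\star}(x_M) = \min_{\{u_k,z_k\}_{k=M}^{N-1}} \sum_{k=M}^{N-1} \ell(x_k,u_k,z_k).
$$

Replacing the exact tail value $V^{\star}$ with the learned approximation $V(x) = x^{\mathsf{T}} P x$ and taking $M = 1$ recovers the one-step problem of minimizing $\ell(x(t),u,z) + V(f(x(t),u,z))$ subject to $g(x(t),u,z)\le 0$.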

Paper Structure (5 sections, 8 equations, 1 figure)

Problem formulation
Illustrative example
Imputing the cost-to-go offline
Online Controller Performance
Conclusion

Figures (1)

Figure 1: (a) Prey and (b) predator state population for the three training solution set trajectories (in light blue) computed offline (without plant-model mismatch and measurement noise), the trajectory computed with MINMPC controller (in blue), and the trajectory computed with the myopic MPC controller (in red). The reference is indicated by a black, dashed line. (c) The control decisions for both the training sets and the online controllers. (d) The computation time per controller decision.
