Learning-Based Model Predictive Control for Piecewise Affine Systems with Feasibility Guarantees
Samuel Mallick, Azita Dabiri, Bart De Schutter
TL;DR
This work tackles online MPC for piecewise affine (PWA) systems, where optimizing over the switching sequence can be computationally prohibitive. It introduces a learning-based MPC scheme in which a classifier, trained offline, maps each state to a sequence of PWA regions, reducing the online computation to a linear program while guaranteeing that the policy's output is feasible. A convex-partition-based structure, an iterative data-generation procedure, and a tightened-MPC feasibility test underpin the approach, reducing the offline computation relative to explicit MPC and the online computation relative to online MPC. Numerical results show low suboptimality, concentrated near partition boundaries, and significantly lower online computation time, enabling real-time control of PWA systems with feasibility guarantees.
Abstract
Online model predictive control (MPC) for piecewise affine (PWA) systems requires the online solution of an optimization problem that implicitly optimizes over the switching sequence of PWA regions, and the resulting computational burden can be prohibitive. Alternatively, the computation can be moved offline using explicit MPC; however, the online memory requirements and the offline computation can then become excessive. In this work we propose a solution in between online and explicit MPC that addresses the above issues by dividing the computation between the online and offline phases. To solve the underlying MPC problem, a policy, learned offline, specifies the sequence of PWA regions that the dynamics must follow, thus reducing the complexity of the remaining optimization problem, which is solved over only the continuous states and control inputs. We provide a condition, verifiable during learning, that guarantees feasibility of the learned policy's output, such that an optimal continuous control input can always be found online. Furthermore, a method for iteratively generating training data offline allows the feasible policy to be learned efficiently, reducing the offline computational burden. A numerical experiment demonstrates the effectiveness of the method compared to both online and explicit MPC.
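To illustrate why fixing the region sequence simplifies the problem, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical scalar PWA system with two regions; the names `REGIONS`, `rollout`, and `is_consistent` and all numerical values are illustrative assumptions. Once the sequence is prescribed, the dynamics at each step are affine, and what remains is to check (or, in an optimization, to enforce as linear constraints) that each predicted state lies in its prescribed region.

```python
# Hypothetical scalar PWA system: x+ = a*x + b*u + c with region-dependent (a, b, c).
# Region 0 covers x <= 0 and region 1 covers x >= 0 (illustrative values only).
REGIONS = {
    0: {"a": 0.5, "b": 1.0, "c": 0.0, "lo": float("-inf"), "hi": 0.0},
    1: {"a": 1.2, "b": 1.0, "c": 0.1, "lo": 0.0, "hi": float("inf")},
}


def rollout(x0, inputs, sequence):
    """Propagate the affine dynamics dictated by a fixed region sequence."""
    states = [x0]
    for u, i in zip(inputs, sequence):
        r = REGIONS[i]
        states.append(r["a"] * states[-1] + r["b"] * u + r["c"])
    return states


def is_consistent(states, sequence, tol=1e-9):
    """Check that state k lies in the region the sequence prescribes for step k."""
    return all(
        REGIONS[i]["lo"] - tol <= x <= REGIONS[i]["hi"] + tol
        for x, i in zip(states, sequence)
    )


if __name__ == "__main__":
    sequence = [1, 1, 0]        # region sequence, e.g. as output by a learned policy
    inputs = [-0.5, -2.0, 0.0]  # candidate continuous control inputs
    states = rollout(1.0, inputs, sequence)
    print(states, is_consistent(states, sequence))
```

In this sketch a sequence is feasible for a given initial state if some input sequence makes `is_consistent` return true; in an MPC setting that search would be carried out by the optimization over the continuous states and inputs rather than by trying candidate inputs by hand.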
