Learning stabilising policies for constrained nonlinear systems
Daniele Ravasio, Danilo Saccani, Marcello Farina, Giancarlo Ferrari-Trecate
TL;DR
The paper addresses constrained nonlinear control for systems modelled by recurrent neural networks and subject to additive disturbances, where both input and output constraints must be satisfied. It introduces a two-layer architecture. A stabilising base controller guarantees incremental input-to-state stability ($\delta$ISS) and constraint satisfaction within a robustly positive invariant (RPI) set. A performance-boosting layer based on internal model control (IMC), implemented as a stable neural operator, optimises closed-loop performance via unconstrained learning while preserving stability and safety. The base controller is obtained through a synthesis based on linear matrix inequalities (LMIs) that yields a gain $K$ and the RPI set, which in turn provides the safety bounds. The boosting layer uses a projected IMC scheme with a learnable operator $\boldsymbol{\mathcal{M}}(\theta)$ parameterised by a Recurrent Equilibrium Network, trained over disturbance samples and projected onto a safe boosting set. A key theoretical result shows that all feasible, $\mathcal{L}_p$-stable closed-loop maps can be realised within the proposed framework, and stability is preserved even when the boosting optimisation is stopped early. The approach is validated on a pH-neutralisation benchmark, where it achieves equilibrium tracking, constraint satisfaction, and improvements in the reported performance metrics.
Abstract
This work proposes a two-layered control scheme for constrained nonlinear systems represented by a class of recurrent neural networks and affected by additive disturbances. In particular, a base controller ensures global or regional closed-loop $\ell_p$-stability of the error in tracking a desired equilibrium and the satisfaction of input and output constraints within a robustly positive invariant set. An additional control contribution, derived by combining the internal model control principle with a stable operator, is introduced to improve system performance. This operator, implemented as a stable neural network, can be trained via unconstrained optimisation on a chosen performance metric, without compromising closed-loop equilibrium tracking or constraint satisfaction, even if the optimisation is stopped prematurely. In addition, we characterise the class of closed-loop stable behaviours that can be achieved with the proposed architecture. Simulation results on a pH-neutralisation benchmark demonstrate the effectiveness of the proposed approach.
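
The two-layer idea summarised above can be illustrated with a toy sketch. This is not the authors' implementation: the scalar model, the gain `K`, the bounds `W_MAX` and `V_MAX`, the FIR filter standing in for the stable neural operator, and the interval clipping standing in for the projection onto the safe boosting set are all hypothetical choices made only for illustration. The sketch shows a base feedback law that keeps the state in an invariant interval, a boosting term computed from disturbances reconstructed through the nominal model, and the fact that the bounds hold for an arbitrary (untrained) parameter vector `theta`.

```python
import numpy as np

# All numbers below are hypothetical, chosen so the toy closed loop is contractive.
A, B, K = 0.5, 1.0, 0.4        # toy model parameters and base feedback gain
W_MAX, V_MAX = 0.1, 0.2        # disturbance bound and boosting-signal bound

def model(x, u):
    """Nominal one-step model (scalar stand-in for an RNN model)."""
    return A * np.tanh(x) + B * u

def boost(theta, w_hist):
    """FIR filter on past reconstructed disturbances; stable for any theta."""
    return float(np.dot(theta, w_hist))

def project(v):
    """Clip onto an interval (stand-in for the safe boosting set)."""
    return float(np.clip(v, -V_MAX, V_MAX))

def simulate(theta, x0, w_seq):
    x = x0
    w_hist = np.zeros(len(theta))          # most recent reconstruction first
    xs, us, w_hats = [x0], [], []
    for w in w_seq:
        v = project(boost(theta, w_hist))  # performance-boosting contribution
        u = -K * x + v                     # base controller plus boosting term
        x_next = model(x, u) + w           # true system: model plus disturbance
        w_hat = x_next - model(x, u)       # internal-model reconstruction of w
        w_hist = np.concatenate(([w_hat], w_hist[:-1]))
        xs.append(x_next)
        us.append(u)
        w_hats.append(w_hat)
        x = x_next
    return np.array(xs), np.array(us), np.array(w_hats)

rng = np.random.default_rng(0)
w_seq = rng.uniform(-W_MAX, W_MAX, size=200)
theta = np.array([5.0, -3.0, 1.0])         # arbitrary, untrained parameters
xs, us, w_hats = simulate(theta, x0=0.5, w_seq=w_seq)
```

In this toy setting the map $x \mapsto A\tanh(x) - Kx$ has slope of magnitude at most $0.4$, so $|x_{t+1}| \le 0.4|x_t| + V_{\max} + W_{\max}$ and the interval $|x| \le 0.5$ is invariant regardless of `theta`. Training would only change `theta`, which is why stopping it early does not affect the bounds in this sketch.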
