R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks
Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester
TL;DR
R2DNs address the scalability bottleneck of recurrent equilibrium networks (RENs) by removing the equilibrium layer and replacing the scalar activation with a $1$-Lipschitz DNN, while maintaining contraction and Lipschitz robustness via a direct parameterization. The authors provide a rigorous LTI and IQC-based framework to directly parameterize contracting and $\gamma$-Lipschitz R2DNs, enabling efficient GPU computation and flexible architectural choices. Empirical results across nonlinear system identification, PDE observer design, and learning-based control show that R2DNs match REN performance while running up to an order of magnitude faster, and that their computation time scales more favorably as expressivity increases. This work makes robust, data-driven controllers and state estimators more practical in high-dimensional settings, offering scalable guarantees and design flexibility for safety-critical applications.
Abstract
This paper presents the Robust Recurrent Deep Network (R2DN), a scalable parameterization of robust recurrent neural networks for machine learning and data-driven control. We construct R2DNs as a feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network, and directly parameterize the weights so that our models are stable (contracting) and robust to small input perturbations (Lipschitz) by design. Our parameterization uses a structure similar to the previously proposed recurrent equilibrium networks (RENs), but without the requirement to iteratively solve an equilibrium layer at each time step. This speeds up model evaluation and backpropagation on GPUs, and makes it computationally feasible to scale up the network size, batch size, and input sequence length in comparison to RENs. We compare R2DNs to RENs on three representative problems in nonlinear system identification, observer design, and learning-based feedback control, and find that training and inference are both up to an order of magnitude faster with similar test set performance, and that training/inference times scale more favorably with respect to model expressivity.
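The feedback structure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the feedforward network has no direct feedthrough from its own output to its input, which is what makes an equilibrium solve unnecessary, and it uses spectral normalization of each layer as a simple stand-in for a 1-Lipschitz network. All names (`spectral_normalize`, `make_lipschitz_mlp`, `simulate`) and the matrix labels are hypothetical. The weights are random, so the sketch shows only the model structure and does not reproduce the direct parameterization that guarantees contraction or a $\gamma$-Lipschitz bound.

```python
import numpy as np

def spectral_normalize(W):
    """Scale W so that its largest singular value is at most 1."""
    sigma = np.linalg.norm(W, 2)  # spectral norm of a 2-D array
    return W / max(sigma, 1.0)

def make_lipschitz_mlp(sizes, rng):
    """Random ReLU MLP; every layer has spectral norm <= 1, so the
    composition is 1-Lipschitz in the Euclidean norm (ReLU is 1-Lipschitz)."""
    weights = [spectral_normalize(rng.standard_normal((m, n)))
               for n, m in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(m) for m in sizes[1:]]

    def phi(v):
        h = v
        for i, (W, b) in enumerate(zip(weights, biases)):
            h = W @ h + b
            if i < len(weights) - 1:  # no activation on the output layer
                h = np.maximum(h, 0.0)
        return h

    return phi

def simulate(params, phi, x0, us):
    """Run the LTI system in feedback with phi over an input sequence.

    Assumption: v depends only on the state x and the input u (not on w),
    so each step is a single explicit forward pass through phi.
    """
    A, B1, B2, C1, D12, C2, D21, D22 = params
    x, ys = x0, []
    for u in us:
        v = C1 @ x + D12 @ u             # input to the feedforward network
        w = phi(v)                       # 1-Lipschitz nonlinearity
        y = C2 @ x + D21 @ w + D22 @ u   # model output
        x = A @ x + B1 @ w + B2 @ u      # state update
        ys.append(y)
    return np.stack(ys)

rng = np.random.default_rng(0)
nx, nu, ny, nv, nw = 4, 2, 1, 3, 3       # illustrative dimensions
phi = make_lipschitz_mlp([nv, 8, nw], rng)
shapes = [(nx, nx), (nx, nw), (nx, nu), (nv, nx),
          (nv, nu), (ny, nx), (ny, nw), (ny, nu)]
params = [0.1 * rng.standard_normal(s) for s in shapes]  # random, not certified
us = rng.standard_normal((20, nu))
ys = simulate(params, phi, np.zeros(nx), us)
```

Because each time step is an explicit matrix-vector product followed by one pass through the feedforward network, the loop body maps directly onto batched GPU operations, which is the source of the speedup over an iterative equilibrium solve.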
