R2DN: Scalable Parameterization of Contracting and Lipschitz Recurrent Deep Networks

Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester

TL;DR

R2DNs address the scalability bottleneck of recurrent equilibrium networks (RENs) by removing the equilibrium layer and replacing the scalar activation with a $1$-Lipschitz deep neural network (DNN), while preserving contraction and Lipschitz robustness through a direct parameterization. The authors use a framework based on linear time-invariant (LTI) systems and incremental integral quadratic constraints (IQCs) to directly parameterize contracting and $\gamma$-Lipschitz R2DNs, enabling efficient GPU computation and flexible architectural choices. Empirical results across nonlinear system identification, PDE observer design, and learning-based control show that R2DNs match REN performance with speedups of up to an order of magnitude, and that computation time scales more favorably as expressivity increases. This makes robust, data-driven controllers and state estimators more practical in high-dimensional settings, including safety-critical applications.

Abstract

This paper presents the Robust Recurrent Deep Network (R2DN), a scalable parameterization of robust recurrent neural networks for machine learning and data-driven control. We construct R2DNs as a feedback interconnection of a linear time-invariant system and a 1-Lipschitz deep feedforward network, and directly parameterize the weights so that our models are stable (contracting) and robust to small input perturbations (Lipschitz) by design. Our parameterization uses a structure similar to the previously-proposed recurrent equilibrium networks (RENs), but without the requirement to iteratively solve an equilibrium layer at each time-step. This speeds up model evaluation and backpropagation on GPUs, and makes it computationally feasible to scale up the network size, batch size, and input sequence length in comparison to RENs. We compare R2DNs to RENs on three representative problems in nonlinear system identification, observer design, and learning-based feedback control and find that training and inference are both up to an order of magnitude faster with similar test set performance, and that training/inference times scale more favorably with respect to model expressivity.
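The feedback structure described in the abstract can be illustrated with a short NumPy sketch. Everything here is an assumption made for illustration: the names `spectral_normalize`, `make_phi`, `r2dn_step`, and `rollout` are hypothetical, bias terms are omitted, and spectral normalization with ReLU is used only as a simple stand-in for a 1-Lipschitz network. The weights are random, so this sketch does not implement the paper's direct parameterization and carries no contraction or Lipschitz guarantee for the closed loop.

```python
import numpy as np

def spectral_normalize(W):
    """Scale W so that its largest singular value is at most 1."""
    sigma = np.linalg.norm(W, 2)  # spectral norm of a 2-D array
    return W / max(sigma, 1.0)

def make_phi(rng, sizes):
    """Build a ReLU feedforward map that is 1-Lipschitz in the 2-norm.

    Stand-in construction (assumption): each weight has spectral norm <= 1
    and ReLU is 1-Lipschitz, so their composition is 1-Lipschitz.
    """
    Ws = [spectral_normalize(rng.standard_normal((n_out, n_in)))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

    def phi(v):
        h = v
        for W in Ws[:-1]:
            h = np.maximum(W @ h, 0.0)
        return Ws[-1] @ h

    return phi

def r2dn_step(params, phi, x, u):
    """One time step of the LTI system in feedback with phi."""
    A, B1, B2, C1, D12, C2, D21, D22 = params
    v = C1 @ x + D12 @ u              # no feedthrough from w to v
    w = phi(v)                        # explicit evaluation, no equilibrium solve
    x_next = A @ x + B1 @ w + B2 @ u
    y = C2 @ x + D21 @ w + D22 @ u
    return x_next, y

def rollout(params, phi, x0, us):
    """Simulate the model over an input sequence of shape (T, nu)."""
    x, ys = x0, []
    for u in us:
        x, y = r2dn_step(params, phi, x, u)
        ys.append(y)
    return np.stack(ys)

# Hypothetical dimensions and random (not guaranteed-stable) weights.
rng = np.random.default_rng(0)
nx, nu, ny, nv, nw = 4, 2, 1, 8, 8
phi = make_phi(rng, [nv, 16, nw])
params = (
    0.5 * spectral_normalize(rng.standard_normal((nx, nx))),  # A
    0.1 * rng.standard_normal((nx, nw)),                      # B1
    0.1 * rng.standard_normal((nx, nu)),                      # B2
    0.1 * rng.standard_normal((nv, nx)),                      # C1
    0.1 * rng.standard_normal((nv, nu)),                      # D12
    rng.standard_normal((ny, nx)),                            # C2
    rng.standard_normal((ny, nw)),                            # D21
    rng.standard_normal((ny, nu)),                            # D22
)
us = rng.standard_normal((50, nu))
ys = rollout(params, phi, np.zeros(nx), us)
print(ys.shape)
```

Because `v` does not depend on `w`, each step is a plain sequence of matrix products and one network evaluation. In a REN the corresponding step would include a term feeding `w` back into `v`, which makes `w` implicitly defined and is what requires the iterative equilibrium solve mentioned above.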

Paper Structure

This paper contains 16 sections, 2 theorems, 38 equations, 4 figures, 1 table.

Key Result

Proposition 1

Suppose that Assumption asmp:phi holds, and eqn:r2dn-lti is contracting and admits the incremental IQC defined by with $0 \succ Q \in \mathbb{R}^{p\times p}$, $S\in \mathbb{R}^{m\times p}$ and $R=R^\top \in \mathbb{R}^{m\times m}$. Then the system eqn:r2dn is contracting and admits the incremental IQC defined by $(Q,S,R)$.
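For context, the incremental IQC property named in the proposition is usually stated as follows in the REN literature. This is a sketch of the standard definition, not a quotation from the paper; the trajectory labels $a$, $b$, the horizon $T$, and the function $d$ are notation assumed here for illustration. A system with input $u_t \in \mathbb{R}^m$ and output $y_t \in \mathbb{R}^p$ admits the incremental IQC defined by $(Q,S,R)$ if, for any two trajectories with initial states $a$ and $b$ and every horizon $T$,

```latex
\sum_{t=0}^{T}
\begin{bmatrix} y_t^a - y_t^b \\ u_t^a - u_t^b \end{bmatrix}^\top
\begin{bmatrix} Q & S^\top \\ S & R \end{bmatrix}
\begin{bmatrix} y_t^a - y_t^b \\ u_t^a - u_t^b \end{bmatrix}
\ge -d(a,b),
\qquad d(a,b) \ge 0, \quad d(a,a) = 0 .
```

Under this definition, the choice $Q = -\tfrac{1}{\gamma} I$, $S = 0$, $R = \gamma I$ gives $\sum_{t=0}^{T} \|y_t^a - y_t^b\|^2 \le \gamma^2 \sum_{t=0}^{T} \|u_t^a - u_t^b\|^2 + \gamma\, d(a,b)$, which is the sense in which a model is $\gamma$-Lipschitz.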

Figures (4)

  • Figure 1: Block diagrams for the REN and the proposed R2DN architectures. We replace the scalar activation function $\sigma$ with a 1-Lipschitz feedforward network, and modify the LTI system $G$ to remove direct feedthrough from $w \rightarrow v$.
  • Figure 2: The function $f(x,u)$ to be fitted.
  • Figure 3: Computation time for the forward (a) and backward (b) passes as functions of model expressivity for the RENs and R2DNs. Computation time scales more favorably for the R2DN models. Error bars show one standard deviation across 5 random model initializations for each data point. Slope standard deviations are in parentheses.
  • Figure 4: Mean loss curves as a function of training time for each of the three benchmark problems. Bands show the loss range over 10 random model initializations. Note that the first training step also includes the overhead from just-in-time compilation in JAX.

Theorems & Definitions (12)

  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 1
  • proof
  • Remark 1
  • Remark 2
  • Remark 3
  • Proposition 2
  • Remark 4
  • ...and 2 more