L2RU: a Structured State Space Model with prescribed L2-bound

Leonardo Massai, Muhammad Zakwan, Giancarlo Ferrari-Trecate

TL;DR

L2RU is a structured state-space model (SSM) with a prescribed L2-gain bound, which guarantees input-output stability for all parameter values. It provides two free parametrizations of discrete-time LTI subsystems: one that is complete for square systems, and a second, more efficient one that also covers non-square systems. Together they form the backbone of an L2-bounded SSM layer that can be trained by unconstrained optimization while retaining its stability guarantees. The framework also includes a long-memory initialization strategy and a composition result showing that stacking such layers yields an L2-stable overall model. It is validated on nonlinear system identification benchmarks, where it matches or outperforms existing SSM architectures while training faster. The result is a principled, robust building block for learning-based control and identification.

Abstract

Structured state-space models (SSMs) have recently emerged as a powerful architecture at the intersection of machine learning and control, featuring layers composed of discrete-time linear time-invariant (LTI) systems followed by pointwise nonlinearities. These models combine the expressiveness of deep neural networks with the interpretability and inductive bias of dynamical systems, offering strong performance on long-sequence tasks with favorable computational complexity. However, their adoption in applications such as system identification and optimal control remains limited by the difficulty of enforcing stability and robustness in a principled and tractable manner. We introduce L2RU, a class of SSMs endowed with a prescribed $\mathcal{L}_2$-gain bound, guaranteeing input--output stability and robustness for all parameter values. The L2RU architecture is derived from free parametrizations of LTI systems satisfying an $\mathcal{L}_2$ constraint, enabling unconstrained optimization via standard gradient-based methods while preserving rigorous stability guarantees. Specifically, we develop two complementary parametrizations: a non-conservative formulation that provides a complete characterization of square LTI systems with a given $\mathcal{L}_2$-bound, and a conservative formulation that extends the approach to general (possibly non-square) systems while improving computational efficiency through a structured representation of the system matrices. Both parametrizations admit efficient initialization schemes that facilitate training long-memory models. We demonstrate the effectiveness of the proposed framework on a nonlinear system identification benchmark, where L2RU achieves improved performance and training stability compared to existing SSM architectures, highlighting its potential as a principled and robust building block for learning and control.
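
To make the layer structure described in the abstract concrete, the following is a minimal NumPy sketch of one DT LTI system followed by a pointwise nonlinearity, together with a frequency-grid estimate of its $\mathcal{L}_2$-gain. It is an illustration under stated assumptions, not the paper's parametrization: the function names `lti_scan` and `l2_gain_on_grid`, the toy scalar system, and the choice of `tanh` are all hypothetical, and the grid search only estimates the gain from below.

```python
import numpy as np

def lti_scan(A, B, C, D, u):
    """Simulate x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k] with x[0] = 0."""
    x = np.zeros(A.shape[0])
    y = np.empty((u.shape[0], C.shape[0]))
    for k in range(u.shape[0]):
        y[k] = C @ x + D @ u[k]
        x = A @ x + B @ u[k]
    return y

def l2_gain_on_grid(A, B, C, D, n_freq=2048):
    """Peak of sigma_max(C (e^{jw} I - A)^{-1} B + D) over a grid on [0, pi].

    For a stable real system this approximates the L2-gain from below.
    """
    n = A.shape[0]
    peak = 0.0
    for w in np.linspace(0.0, np.pi, n_freq):
        G = C @ np.linalg.solve(np.exp(1j * w) * np.eye(n) - A, B) + D
        peak = max(peak, np.linalg.svd(G, compute_uv=False)[0])
    return peak

# Toy scalar system (assumption): G(z) = 1 / (z - 0.5), peak gain 1 / (1 - 0.5) = 2 at w = 0.
A = np.array([[0.5]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])

rng = np.random.default_rng(0)
u = rng.standard_normal((500, 1))

# One layer: LTI system followed by a pointwise 1-Lipschitz map with tanh(0) = 0.
y = np.tanh(lti_scan(A, B, C, D, u))

gain = l2_gain_on_grid(A, B, C, D)
ratio = np.linalg.norm(y) / np.linalg.norm(u)
print(f"gain on grid: {gain:.3f}, output/input energy ratio: {ratio:.3f}")
```

Because `tanh` is 1-Lipschitz and maps zero to zero, the energy ratio of the layer cannot exceed the gain of its linear part, which is the property a prescribed bound is meant to enforce by construction rather than by checking after the fact.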

Paper Structure

This paper contains 14 sections, 5 theorems, 45 equations, 4 figures, 2 tables.

Key Result

Proposition 1

(DT Real Bounded Lemma, caverly_lmi_2024) Let $g_{(A,B,C,D)} : \mathcal{L}^{n_u} \mapsto \mathcal{L}^{n_y}$ be a DT LTI system described in state space. $g$ has finite $\mathcal{L}_2$-gain if and only if there exist $P \succ 0$ and $\gamma > 0$ such that (eq:realbounded) holds or, equivalently, (eq:realbounded2) holds. Moreover, the $\mathcal{L}_2$-gain of $g$ is equal to the infimum among all $\gamma$ such that (eq:realbounded) or (eq:realbounded2) is satisfied.
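
The two matrix inequalities of the proposition are not reproduced in this summary. For orientation, one standard textbook form of the discrete-time bounded real lemma is sketched below. This is an assumption: the paper's own equations may differ in scaling or in block arrangement.

```latex
% Dissipation form: with V(x) = x^\top P x, this states
% V(x_{k+1}) - V(x_k) + \|y_k\|^2 - \gamma^2 \|u_k\|^2 < 0 for all (x_k, u_k) \neq 0.
\begin{equation*}
\begin{bmatrix}
A^\top P A - P + C^\top C & A^\top P B + C^\top D \\
B^\top P A + D^\top C & B^\top P B + D^\top D - \gamma^2 I
\end{bmatrix} \prec 0 .
\end{equation*}
% Equivalent form after replacing P by P / \gamma and taking a Schur complement
% with respect to the -\gamma I block:
\begin{equation*}
\begin{bmatrix}
A^\top P A - P & A^\top P B & C^\top \\
B^\top P A & B^\top P B - \gamma I & D^\top \\
C & D & -\gamma I
\end{bmatrix} \prec 0 .
\end{equation*}
```

Summing the dissipation inequality over $k$ from a zero initial state gives $\sum_k \|y_k\|^2 \le \gamma^2 \sum_k \|u_k\|^2$, which is the sense in which any feasible $\gamma$ upper-bounds the $\mathcal{L}_2$-gain.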

Figures (4)

  • Figure 1: L2RU architecture presented in this paper. The model consists of a series of state-space layers, each comprised of $\mathcal{L}_2$-bounded DT LTI systems and Lipschitz-bounded nonlinearities. The input/output is pre- and post-processed by linear transformations.
  • Figure 2: Triple-tank system with recirculation pump.
  • Figure 3: (a) Comparison of the open-loop prediction of the trained distributed L2RU versus ground truth on an independent validation dataset. For the sake of legibility, we only show the first 600 time steps. (b) Comparison between training losses obtained with the initialization of Proposition \ref{prop:init} and with a random initialization.
  • Figure 4: Validation loss versus number of parameters for three models.
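
The architecture in the Figure 1 caption (linear pre-processing, a series of layers each made of a bounded LTI system and a Lipschitz nonlinearity, linear post-processing) can be sketched as follows. This is a hedged illustration, not the paper's construction. Each LTI block is rescaled after sampling using the conservative bound $\|D\| + \|C\|\|B\|/(1-\|A\|)$, which stands in for the free parametrizations, and all names (`make_layer`, `gain_upper_bound`, `W_in`, `W_out`) are hypothetical.

```python
import numpy as np

def lti_scan(A, B, C, D, u):
    """Simulate x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k] with x[0] = 0."""
    x = np.zeros(A.shape[0])
    y = np.empty((u.shape[0], C.shape[0]))
    for k in range(u.shape[0]):
        y[k] = C @ x + D @ u[k]
        x = A @ x + B @ u[k]
    return y

def gain_upper_bound(A, B, C, D):
    """Conservative L2-gain bound ||D|| + ||C|| ||B|| / (1 - ||A||), valid if ||A||_2 < 1."""
    norm_A = np.linalg.norm(A, 2)
    assert norm_A < 1.0, "this simple bound needs a contractive A"
    return np.linalg.norm(D, 2) + np.linalg.norm(C, 2) * np.linalg.norm(B, 2) / (1.0 - norm_A)

def make_layer(rng, n_x, width, gamma):
    """Sample a random LTI block and rescale (C, D) so that its gain bound equals gamma."""
    A = rng.standard_normal((n_x, n_x))
    A *= 0.7 / np.linalg.norm(A, 2)  # force ||A||_2 = 0.7 < 1
    B = rng.standard_normal((n_x, width))
    C = rng.standard_normal((width, n_x))
    D = rng.standard_normal((width, width))
    s = gamma / gain_upper_bound(A, B, C, D)
    return A, B, s * C, s * D

rng = np.random.default_rng(0)
n_in, n_out, width, n_x = 2, 1, 4, 8
gammas = [1.0, 1.0, 1.0]  # per-layer gain bounds (assumed values)

W_in = rng.standard_normal((width, n_in))    # linear pre-processing
W_out = rng.standard_normal((n_out, width))  # linear post-processing
layers = [make_layer(rng, n_x, width, g) for g in gammas]

u = rng.standard_normal((300, n_in))
h = u @ W_in.T
for A, B, C, D in layers:
    h = np.tanh(lti_scan(A, B, C, D, h))  # 1-Lipschitz nonlinearity, tanh(0) = 0
y = h @ W_out.T

# Gains of cascaded blocks multiply, so the product bounds the whole model.
bound = np.linalg.norm(W_in, 2) * np.prod(gammas) * np.linalg.norm(W_out, 2)
ratio = np.linalg.norm(y) / np.linalg.norm(u)
print(f"energy ratio {ratio:.3f} <= overall bound {bound:.3f}")
```

The overall bound is the product of the individual gains because the blocks are connected in series. The post-hoc rescaling used here is much more conservative than a parametrization derived from the lemma in Proposition 1, which is the gap the paper's free parametrizations address.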

Theorems & Definitions (10)

  • Definition 1
  • Proposition 1
  • Definition 2
  • Remark 1
  • Theorem 1
  • Remark 2
  • Remark 3
  • Proposition 2
  • Theorem 2
  • Theorem 3