
Structured state-space models are deep Wiener models

Fabio Bonassi, Carl Andersson, Per Mattsson, Thomas B. Schön

TL;DR

This paper reframes Structured State-space Models (SSMs) as deep Wiener models and surveys their use for long-range sequence tasks and nonlinear system identification. It details discrete-time diagonal and continuous-time Diagonal Plus Low Rank (DPLR) parametrizations, along with initialization strategies and efficient simulation via convolution, parallel scan, and FFT. It also emphasizes incremental input-to-state stability (δISS) guarantees, obtained by constraining the state matrix to be Schur stable in discrete time or to have eigenvalues with negative real part in continuous time. On the Silverbox benchmark, it shows that SSMs can achieve competitive accuracy with fewer learnable parameters than traditional neural architectures, while their training is highly parallelizable. The work also discusses practical considerations such as aliasing in continuous-time settings and highlights future research directions, including minimal structures, data-driven initializations, and comprehensive benchmarking to guide real-world adoption.

Abstract

The goal of this paper is to provide a system identification-friendly introduction to Structured State-space Models (SSMs). These models have recently become popular in the machine learning community because, owing to their parallelizability, they can be trained efficiently and scalably to tackle classification and regression problems on extremely long sequences. Interestingly, SSMs appear to be an effective way to learn deep Wiener models, which allows SSMs to be reframed as an extension of a model class commonly used in system identification. To stimulate a fruitful exchange of ideas between the machine learning and system identification communities, we deem it useful to summarize the recent contributions on the topic in a structured and accessible form. Finally, we highlight future research directions in which this community could make impactful contributions.
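
To make the "deep Wiener model" reading concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a discrete-time SSM with a complex diagonal state matrix whose eigenvalues lie strictly inside the unit circle (Schur stable), followed by a static `tanh` nonlinearity. All function names, the initialization ranges, and the choice of `tanh` are illustrative assumptions. The sketch also checks that the step-by-step recurrence and a direct (non-FFT) convolution with the impulse response give the same output, which is the property that makes parallel simulation possible.

```python
import numpy as np

def make_diagonal_ssm(n_state, n_in, n_out, rng):
    # Hypothetical initialization: eigenvalues lam = r * exp(i * theta)
    # with r < 1, so diag(lam) is Schur stable by construction.
    radius = rng.uniform(0.5, 0.95, n_state)
    phase = rng.uniform(0.0, np.pi, n_state)
    lam = radius * np.exp(1j * phase)
    B = rng.standard_normal((n_state, n_in)) + 0j
    C = rng.standard_normal((n_out, n_state)) + 0j
    return lam, B, C

def simulate_recurrent(lam, B, C, u):
    # x_{k+1} = diag(lam) x_k + B u_k,  y_k = Re(C x_k),  x_0 = 0
    T = u.shape[0]
    x = np.zeros(lam.shape[0], dtype=complex)
    y = np.empty((T, C.shape[0]))
    for k in range(T):
        y[k] = (C @ x).real
        x = lam * x + B @ u[k]
    return y

def simulate_convolution(lam, B, C, u):
    # Same output via the impulse response h[t] = Re(C diag(lam)^t B):
    # y_k = sum_{j=0}^{k-1} h[k-1-j] u_j
    T = u.shape[0]
    powers = lam[None, :] ** np.arange(T)[:, None]        # (T, n_state)
    h = np.einsum('os,ts,si->toi', C, powers, B).real     # (T, n_out, n_in)
    y = np.zeros((T, C.shape[0]))
    for k in range(1, T):
        y[k] = np.einsum('toi,ti->o', h[:k][::-1], u[:k])
    return y

def wiener_layer(lam, B, C, W, b, u):
    # Wiener structure: linear dynamics followed by a static nonlinearity.
    eta = simulate_recurrent(lam, B, C, u)
    return np.tanh(eta @ W.T + b)

def deep_wiener(layers, u):
    # Stack of Wiener layers; each element is a tuple (lam, B, C, W, b).
    for lam, B, C, W, b in layers:
        u = wiener_layer(lam, B, C, W, b, u)
    return u
```

Because the state matrix is diagonal, the impulse response reduces to elementwise powers of `lam`. For long sequences the explicit convolution loop above would typically be replaced by an FFT-based convolution or a parallel scan.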

Paper Structure

This paper contains 22 sections, 34 equations, 2 figures, 1 table, 1 algorithm.

Figures (2)

  • Figure 1: Schematic of a Structured State-space Model.
  • Figure 2: Free-run simulation error of the trained SSMs (black dotted line) with respect to the ground truth (blue line) over the entire test dataset.