Deep Long-Short Term Memory networks: Stability properties and Experimental validation

Fabio Bonassi, Alessio La Bella, Giulio Panzani, Marcello Farina, Riccardo Scattolini

TL;DR

It is shown that suitable sufficient conditions on the weights of the network can be leveraged to set up a training procedure able to learn provably $\delta$ISS LSTM models from data.

Abstract

The aim of this work is to investigate the use of Incrementally Input-to-State Stable ($\delta$ISS) deep Long Short-Term Memory networks (LSTMs) for the identification of nonlinear dynamical systems. We show that suitable sufficient conditions on the weights of the network can be leveraged to set up a training procedure able to learn provably $\delta$ISS LSTM models from data. The proposed approach is tested on a real brake-by-wire apparatus to identify a model of the system from experimentally collected input-output data. Results show satisfactory modeling performance.

Paper Structure

This paper contains 13 sections, 3 theorems, 27 equations, 7 figures.

Key Result

Proposition 1

For any layer $l \in \{ 1, ..., L \}$, the squashed input $r_k^{(l)}$ and the gates $f_k^{(l)}$, $i_k^{(l)}$, and $z_k^{(l)}$ are bounded as

Figures (7)

  • Figure 1: Schematic of a deep LSTM architecture, consisting of the concatenation of $L$ LSTM layers, where each layer takes the updated hidden state of the preceding layer as input.
  • Figure 2: Schematic of the brake-by-wire prototype apparatus.
  • Figure 3: Performance testing: input sequence (i.e., the piston position) applied to the deep LSTM model.
  • Figure 4: Performance testing: open-loop prediction of the trained deep LSTM model (red solid line) compared to a linear model (green dash-dotted line) and the ground truth (blue dotted line).
  • Figure 5: Performance testing: LSTM modeling detail during a highly dynamic transient.
  • ...and 2 more figures

Theorems & Definitions (9)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Definition 1: $\mathcal{K}_\infty$ function
  • Definition 2: $\mathcal{KL}$ function
  • Definition 3: $\delta$ISS [bayer2013discrete]
  • Theorem 1
  • proof
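
Definitions 1 to 3 are listed above by name only. As a reference sketch, their standard discrete-time formulations are given here for a generic system $x_{k+1} = f(x_k, u_k)$. The symbols $x_k$, $u_k$, $\beta$, $\gamma$ and the subscripts $a$, $b$ are notational assumptions, and the paper's exact statements may differ in detail.

A function $\gamma$ is of class $\mathcal{K}_\infty$ if

$$
\gamma : \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0} \ \text{is continuous and strictly increasing}, \quad \gamma(0) = 0, \quad \lim_{s \to \infty} \gamma(s) = \infty .
$$

A function $\beta(s, k)$ is of class $\mathcal{KL}$ if, for each fixed $k$, $\beta(\cdot, k)$ is continuous, strictly increasing and zero at zero, and, for each fixed $s$, $\beta(s, \cdot)$ is decreasing with

$$
\lim_{k \to \infty} \beta(s, k) = 0 .
$$

The system is $\delta$ISS if there exist $\beta \in \mathcal{KL}$ and $\gamma \in \mathcal{K}_\infty$ such that, for any two initial states $x_{a,0}$, $x_{b,0}$ and any two admissible input sequences $u_a$, $u_b$, the corresponding trajectories satisfy, for all $k \geq 0$,

$$
\| x_{a,k} - x_{b,k} \| \leq \beta\big( \| x_{a,0} - x_{b,0} \|, k \big) + \gamma\Big( \max_{h \geq 0} \| u_{a,h} - u_{b,h} \| \Big) .
$$

Read this way, the $\beta$ term says that the effect of different initial conditions fades over time, and the $\gamma$ term says that the gap between two trajectories is bounded by the largest gap between their inputs.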