Infinity-norm-based Input-to-State-Stable Long Short-Term Memory networks: a thermal systems perspective

Stefano De Carli, Davide Previtali, Leandro Pitturelli, Mirko Mazzoleni, Antonio Ferramosca, Fabio Previdi

TL;DR

The paper addresses stability of RNN-based system identification in thermal processes by deriving a sufficient ISS$_{\infty}$ condition for LSTM networks and proposing an ISS$_{\infty}$-promoted training strategy that includes a stability penalty and early stopping. A layer-wise stability theorem shows that an LSTM layer is ISS$_{\infty}$ when $\bar{\sigma}_f^{(l)} + \bar{\sigma}_i^{(l)} \Vert R_g^{(l)} \Vert_{\infty} < 1$, enabling ISS$_{\infty}$ for the whole network when all layers satisfy this criterion. The ISS$_{\infty}$-promoted training is demonstrated on a thermal shrink-tunnel case study, where the ISS$_{\infty}$ LSTM and ISS$_{\infty}$ GRU outperform a physics-based grey-box model and non-ISS variants, with the LSTM achieving comparable or better accuracy using an order of magnitude fewer parameters. The work highlights the practical value of stability-aware data-driven modeling for safe deployment in thermal control contexts and suggests broader applicability to RNN-based identification tasks.

Abstract

Recurrent Neural Networks (RNNs) have shown remarkable performance in system identification, particularly for nonlinear dynamical systems such as thermal processes. However, stability remains a critical challenge in practical applications: although the underlying process may be intrinsically stable, there is no guarantee that the resulting RNN model captures this behavior. This paper addresses the stability issue by deriving a sufficient condition for Input-to-State Stability based on the infinity-norm (ISS$_{\infty}$) for Long Short-Term Memory (LSTM) networks. The obtained condition depends on fewer network parameters than those in prior works. An ISS$_{\infty}$-promoted training strategy is developed, incorporating a penalty term in the loss function that encourages stability and an ad hoc early stopping approach. The quality of LSTM models trained via the proposed approach is validated on a thermal system case study, where the ISS$_{\infty}$-promoted LSTM outperforms both a physics-based model and an ISS$_{\infty}$-promoted Gated Recurrent Unit (GRU) network while also surpassing non-ISS$_{\infty}$-promoted LSTM and GRU RNNs.

Paper Structure

This paper contains 10 sections, 2 theorems, 39 equations, 3 figures, 1 table.

Key Result

Theorem 1

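The $l$-th LSTM layer, $l \in \{1, \ldots, L\}$, described by its cell-state and hidden-state update equations, is ISS$_{\infty}$ if the following sufficient condition holds:

$$\bar{\sigma}_f^{(l)} + \bar{\sigma}_i^{(l)} \Vert R_g^{(l)} \Vert_{\infty} < 1,$$

where the bounds $\bar{\sigma}_j^{(l)}$, $j \in \{f, i, o\}$, are defined in the paper.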
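As a rough illustration of how the layer-wise condition $\bar{\sigma}_f^{(l)} + \bar{\sigma}_i^{(l)} \Vert R_g^{(l)} \Vert_{\infty} < 1$ could be checked numerically, the following Python sketch evaluates it for a single layer. It assumes the bounds $\bar{\sigma}_f^{(l)}$ and $\bar{\sigma}_i^{(l)}$ have already been computed (their expressions are not reproduced here), and the function names `inf_norm`, `iss_margin`, and `layer_is_iss` are hypothetical, not taken from the paper.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def inf_norm(matrix: Matrix) -> float:
    """Induced infinity norm of a matrix: the maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in matrix)


def iss_margin(sigma_f_bar: float, sigma_i_bar: float, R_g: Matrix) -> float:
    """Left-hand side of the sufficient condition minus 1.

    A negative value means the condition holds for this layer.
    sigma_f_bar and sigma_i_bar are assumed to be precomputed bounds.
    """
    return sigma_f_bar + sigma_i_bar * inf_norm(R_g) - 1.0


def layer_is_iss(sigma_f_bar: float, sigma_i_bar: float, R_g: Matrix) -> bool:
    """True if the sufficient condition is satisfied (strict inequality)."""
    return iss_margin(sigma_f_bar, sigma_i_bar, R_g) < 0.0


# Illustrative values only (not from the paper).
R_g = [[0.2, -0.3],
       [0.1, 0.4]]
print(inf_norm(R_g))                # maximum absolute row sum
print(layer_is_iss(0.6, 0.7, R_g))  # condition satisfied for these values
print(layer_is_iss(0.9, 0.7, R_g))  # condition violated for these values
```

Because the condition is only sufficient, a layer that fails this check is not necessarily unstable; the check simply cannot certify it.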

Figures (3)

  • Figure 1: Schematic of the considered shrink tunnel. In zone $z \in \{1, 2\}$, the heat resistors marked by SSR$z$ are driven by the same solid-state relay, while those denoted by EMR$z$ are managed by the electromechanical relay.
  • Figure 2: Box plot of the fit performance index for each model on test data. The figure also reports the median fits.
  • Figure 3: Comparison of temperature $y_3$ predictions on the test trial focusing on fan frequency changes and pack disturbances. The green vertical stripes denote when $d_{\mathrm{p}, k} = 1$.

Theorems & Definitions (7)

  • Definition 1: Input-to-state stability (ISS$_{\infty}$) [bonassi_stability_2021]
  • Theorem 1: ISS$_{\infty}$ for an LSTM layer
  • Remark 1
  • Theorem 2: ISS$_{\infty}$ for an LSTM network
  • Remark 2
  • Remark 3
  • Remark 4
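
Theorem 2 extends the layer-wise result to the whole network: the network is ISS$_{\infty}$ when every layer satisfies the sufficient condition. The sketch below shows one way this could be checked, together with a penalty term that encourages stability during training. It takes per-layer margins, defined here as $\bar{\sigma}_f^{(l)} + \bar{\sigma}_i^{(l)} \Vert R_g^{(l)} \Vert_{\infty} - 1$, as inputs. The hinge form of the penalty, the tolerance `eps`, and the function names are assumptions made for illustration; the paper's actual penalty may differ.

```python
from typing import Sequence


def network_is_iss(margins: Sequence[float]) -> bool:
    """True if every layer margin is strictly negative (all layers certified)."""
    return all(m < 0.0 for m in margins)


def stability_penalty(margins: Sequence[float], eps: float = 0.0) -> float:
    """Hinge-style penalty (an assumed form, not the paper's).

    Each layer contributes max(0, margin + eps), so the penalty is zero
    once every layer satisfies the condition with slack of at least eps.
    """
    return sum(max(0.0, m + eps) for m in margins)


# Illustrative margins for a two-layer network (not from the paper).
margins = [-0.05, -0.2]
print(network_is_iss(margins))          # both layers are certified
print(stability_penalty(margins, 0.1))  # only the first layer lacks 0.1 slack
```

In a training loop, a term like this would be added to the prediction loss with a weighting factor, and the same check could be used as a criterion for stopping early once all margins are negative.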