Infinity-norm-based Input-to-State-Stable Long Short-Term Memory networks: a thermal systems perspective
Stefano De Carli, Davide Previtali, Leandro Pitturelli, Mirko Mazzoleni, Antonio Ferramosca, Fabio Previdi
TL;DR
The paper addresses stability of RNN-based system identification in thermal processes by deriving a sufficient ISS$_{\infty}$ condition for LSTM networks and proposing an ISS$_{\infty}$-promoted training strategy that includes a stability penalty and early stopping. A layer-wise stability theorem shows that an LSTM layer is ISS$_{\infty}$ when $\bar{\sigma}_f^{(l)} + \bar{\sigma}_i^{(l)} \Vert R_g^{(l)} \Vert_{\infty} < 1$, enabling ISS$_{\infty}$ for the whole network when all layers satisfy this criterion. The ISS$_{\infty}$-promoted training is demonstrated on a thermal shrink-tunnel case study, where the ISS$_{\infty}$ LSTM and ISS$_{\infty}$ GRU outperform a physics-based grey-box model and non-ISS variants, with the LSTM achieving comparable or better accuracy using an order of magnitude fewer parameters. The work highlights the practical value of stability-aware data-driven modeling for safe deployment in thermal control contexts and suggests broader applicability to RNN-based identification tasks.
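The layer-wise criterion above can be checked numerically once the three quantities in it are known. The following is a minimal sketch, assuming that $\bar{\sigma}_f^{(l)}$ and $\bar{\sigma}_i^{(l)}$ are already-computed scalar upper bounds on the forget and input gate activations of layer $l$, and that $R_g^{(l)}$ is available as a square array. That reading of the symbols, and the function names `inf_norm`, `layer_lhs`, and `network_satisfies_condition`, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def inf_norm(matrix):
    """Induced infinity-norm of a matrix: the maximum absolute row sum."""
    return float(np.max(np.sum(np.abs(np.asarray(matrix, dtype=float)), axis=1)))


def layer_lhs(sigma_f_bar, sigma_i_bar, R_g):
    """Left-hand side of the layer condition: sigma_f_bar + sigma_i_bar * ||R_g||_inf.

    sigma_f_bar and sigma_i_bar are assumed to be precomputed scalar bounds;
    how they are obtained from the network weights is not covered here.
    """
    return sigma_f_bar + sigma_i_bar * inf_norm(R_g)


def network_satisfies_condition(layers):
    """True when every layer's left-hand side is strictly below 1.

    `layers` is a list of (sigma_f_bar, sigma_i_bar, R_g) tuples, one per layer.
    """
    return all(layer_lhs(sf, si, R) < 1.0 for sf, si, R in layers)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to exercise the check.
    small_R = np.array([[0.1, -0.2], [0.3, 0.1]])   # infinity-norm 0.4
    large_R = np.array([[1.0, -1.0], [0.5, 0.5]])   # infinity-norm 2.0
    print(network_satisfies_condition([(0.5, 0.9, small_R)]))
    print(network_satisfies_condition([(0.5, 0.9, small_R), (0.5, 0.9, large_R)]))
```

Because the condition is only sufficient, a layer that fails this check is not thereby shown to be unstable; the check can only certify layers that pass.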
Abstract
Recurrent Neural Networks (RNNs) have shown remarkable performance in system identification, particularly for nonlinear dynamical systems such as thermal processes. However, stability remains a critical challenge in practical applications: although the underlying process may be intrinsically stable, there may be no guarantee that the resulting RNN model captures this behavior. This paper addresses the stability issue by deriving a sufficient condition for Input-to-State Stability based on the infinity-norm (ISS$_{\infty}$) for Long Short-Term Memory (LSTM) networks. The obtained condition depends on fewer network parameters than the conditions in prior works. An ISS$_{\infty}$-promoted training strategy is developed, incorporating a penalty term in the loss function that encourages stability and an ad hoc early stopping approach. The quality of LSTM models trained via the proposed approach is validated on a thermal system case study, where the ISS$_{\infty}$-promoted LSTM outperforms a physics-based model, an ISS$_{\infty}$-promoted Gated Recurrent Unit (GRU) network, and non-ISS$_{\infty}$-promoted LSTM and GRU RNNs.
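One way a stability-encouraging penalty term of this kind could look is a hinge on the violation of a per-layer inequality of the form $\bar{\sigma}_f + \bar{\sigma}_i \Vert R_g \Vert_{\infty} < 1$. The sketch below is an assumption made for illustration only: the abstract does not give the penalty's functional form, and the hinge shape, the `margin` and `weight` parameters, and the function names are all hypothetical. The gate bounds are taken as precomputed scalars.

```python
import numpy as np


def iss_penalty(layers, margin=0.05, weight=1.0):
    """Hypothetical hinge penalty on violations of the per-layer inequality.

    `layers` is a list of (sigma_f_bar, sigma_i_bar, R_g) tuples. A layer
    contributes zero once its left-hand side is at most 1 - margin.
    """
    total = 0.0
    for sigma_f_bar, sigma_i_bar, R_g in layers:
        # Induced infinity-norm: maximum absolute row sum.
        r_norm = float(np.max(np.sum(np.abs(np.asarray(R_g, dtype=float)), axis=1)))
        lhs = sigma_f_bar + sigma_i_bar * r_norm
        total += max(0.0, lhs - 1.0 + margin)
    return weight * total


def penalized_loss(fit_loss, layers, margin=0.05, weight=1.0):
    """Data-fit loss plus the hypothetical stability penalty."""
    return fit_loss + iss_penalty(layers, margin=margin, weight=weight)


if __name__ == "__main__":
    # Hypothetical layers: the first satisfies the inequality, the second does not.
    ok_layer = (0.5, 0.9, np.array([[0.1, -0.2], [0.3, 0.1]]))
    bad_layer = (0.5, 0.9, np.array([[1.0, -1.0], [0.5, 0.5]]))
    print(penalized_loss(0.2, [ok_layer, bad_layer]))
```

In an actual training loop the same expression would be written with a differentiable framework so that gradients flow through the gate bounds and the recurrent weights; NumPy is used here only to keep the arithmetic visible.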
