A scalable, gradient-stable approach to multi-step, nonlinear system identification using first-order methods
Cesare Donati, Martina Mammarella, Fabrizio Dabbene, Carlo Novara, Constantino Lagoa
TL;DR
The paper proposes a scalable gradient-based framework for nonlinear multi-step system identification that uses automatic differentiation to compute exact gradients along a prediction horizon. By modeling the gradient evolution as a Linear Parameter-Varying (LPV) dynamical system, it achieves a computational complexity of $\mathcal{O}(T n_x^2 n_\vartheta)$, which is linear in both the horizon length $T$ and the number of parameters $n_\vartheta$ (with $n_x$ the state dimension). A key contribution is a BIBO-based stability analysis that provides conditions to avoid exploding gradients during identification, including barrier-penalty techniques to keep trajectories within stable regions. The approach is validated through numerical examples demonstrating substantial computational gains and effective mitigation of gradient explosion, with implications for scalable nonlinear identification and non-convex optimization in dynamical systems.
Abstract
This paper presents three main contributions to the field of multi-step system identification. First, drawing inspiration from Neural Network (NN) training, it introduces a tool for solving identification problems by leveraging first-order optimization and Automatic Differentiation (AD). The proposed method exploits gradients with respect to the parameters to be identified and leverages Linear Parameter-Varying (LPV) sensitivity equations to model gradient evolution. Second, it demonstrates that the computational complexity of the proposed method is linear in both the multi-step horizon length and the parameter size, ensuring scalability for large identification problems. Third, it formally addresses the "exploding gradient" issue: via a stability analysis of the LPV equations, it derives conditions for a reliable and efficient optimization and identification process for dynamical systems. Simulation results indicate that the proposed method is both effective and efficient, making it a promising tool for future research and applications in nonlinear system identification and non-convex optimization.
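
To make the idea of LPV sensitivity equations concrete, the sketch below propagates the sensitivity $S_k = \partial x_k / \partial \vartheta$ alongside the state using the recursion $S_{k+1} = A_k S_k + B_k$, with $A_k = \partial f / \partial x$ and $B_k = \partial f / \partial \vartheta$ evaluated along the predicted trajectory, and accumulates the gradient of a multi-step squared-error cost. It is a minimal illustration and not the authors' implementation: the two-state model, the parameter values, the input signal, and all function names are assumptions, and the Jacobians are written by hand where the paper relies on automatic differentiation.

```python
import math

# Hypothetical 2-state, 2-parameter model x_{k+1} = f(x_k, u_k, theta).
# It is an assumption made for illustration, not a system from the paper.
def step(x, u, th):
    return [th[0] * x[0] + 0.1 * x[1],
            th[1] * math.tanh(x[0]) + 0.5 * x[1] + u]

def jacobians(x, th):
    """Hand-written Jacobians of `step` (AD would supply these in practice)."""
    t = math.tanh(x[0])
    A = [[th[0], 0.1], [th[1] * (1.0 - t * t), 0.5]]   # df/dx,     n_x x n_x
    B = [[x[0], 0.0], [0.0, t]]                        # df/dtheta, n_x x n_theta
    return A, B

def simulate(th, x0, us):
    """Return the measured output (first state) after each step."""
    x, ys = list(x0), []
    for u in us:
        x = step(x, u, th)
        ys.append(x[0])
    return ys

def loss_and_grad(th, x0, us, ys):
    """Multi-step cost J = (1/T) * sum_k (x1_k - y_k)^2 and its exact gradient."""
    T = len(us)
    x = list(x0)
    S = [[0.0, 0.0], [0.0, 0.0]]   # S_0 = 0: x0 does not depend on theta
    J, g = 0.0, [0.0, 0.0]
    for k in range(T):
        A, B = jacobians(x, th)    # evaluated at x_k, before the state update
        # LPV sensitivity update S_{k+1} = A_k S_k + B_k.
        # Each step costs O(n_x^2 n_theta), so the horizon costs O(T n_x^2 n_theta).
        S = [[sum(A[i][m] * S[m][j] for m in range(2)) + B[i][j]
              for j in range(2)] for i in range(2)]
        x = step(x, us[k], th)     # x_{k+1}
        e = x[0] - ys[k]           # prediction error on the first state
        J += e * e / T
        for j in range(2):
            g[j] += 2.0 * e * S[0][j] / T
    return J, g

# Synthetic data generated with assumed "true" parameters
us = [math.sin(0.3 * k) for k in range(50)]
x0 = [0.5, -0.2]
ys = simulate([0.8, 0.3], x0, us)

theta = [0.6, 0.1]                 # current parameter estimate
J, g = loss_and_grad(theta, x0, us, ys)
print("cost:", J, "gradient:", g)
```

The gradient returned by `loss_and_grad` can be passed to any first-order optimizer. Because $S_k$ is driven by products of the Jacobians $A_k$, it can grow without bound when the predicted trajectory leaves a region where those products stay bounded, which is the mechanism behind the exploding-gradient issue that the stability analysis addresses.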
