A scalable, gradient-stable approach to multi-step, nonlinear system identification using first-order methods

Cesare Donati; Martina Mammarella; Fabrizio Dabbene; Carlo Novara; Constantino Lagoa

TL;DR

The paper proposes a scalable gradient-based framework for nonlinear multi-step system identification that leverages automatic differentiation to compute exact gradients along a prediction horizon. By modeling the gradient evolution as a Linear Parameter-Varying (LPV) dynamical system, it achieves computational complexity that scales linearly with the horizon length $T$ and the number of parameters, specifically $\mathcal{O}(T n_x^2 n_\vartheta)$. A key contribution is a BIBO-based stability analysis that provides conditions to avoid exploding gradients during identification, including barrier-penalty techniques to keep trajectories within stable regions. The approach is validated through numerical examples demonstrating substantial computational gains and effective mitigation of gradient explosion, with implications for scalable nonlinear identification and non-convex optimization in dynamical systems.
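The complexity figure and the role of stability can both be read off a forward sensitivity recursion. The following is a sketch under assumed notation, not the paper's exact statement: a discrete-time model $x_{k+1} = f(x_k, u_k, \vartheta)$ with $x_k \in \mathbb{R}^{n_x}$ and $\vartheta \in \mathbb{R}^{n_\vartheta}$, the sensitivity matrix $\Lambda_k = \mathrm{d}x_k/\mathrm{d}\vartheta$, and bounds $\gamma$, $\beta$ introduced here only for illustration.

$$
\begin{aligned}
\Lambda_{k+1} &= A_k \Lambda_k + B_k, \qquad
A_k = \left.\frac{\partial f}{\partial x}\right|_{(x_k, u_k, \vartheta)} \in \mathbb{R}^{n_x \times n_x}, \quad
B_k = \left.\frac{\partial f}{\partial \vartheta}\right|_{(x_k, u_k, \vartheta)} \in \mathbb{R}^{n_x \times n_\vartheta}, \\
\text{cost of } A_k \Lambda_k &= \mathcal{O}(n_x^2 n_\vartheta)
\;\;\Longrightarrow\;\; \text{cost over } k = 0, \dots, T-1 \text{ is } \mathcal{O}(T n_x^2 n_\vartheta), \\
\|A_k\| \le \gamma < 1,\ \|B_k\| \le \beta
&\;\;\Longrightarrow\;\;
\|\Lambda_k\| \le \gamma^k \|\Lambda_0\| + \beta\,\frac{1 - \gamma^k}{1 - \gamma}
\le \|\Lambda_0\| + \frac{\beta}{1 - \gamma}.
\end{aligned}
$$

Because $A_k$ and $B_k$ depend on the trajectory $(x_k, u_k)$, the recursion is linear in $\Lambda_k$ with time-varying coefficients, which is the LPV structure. The last line is a standard sufficient condition for a bounded sensitivity, given here only to illustrate why keeping the trajectory in a region where $A_k$ is contractive prevents gradient explosion; the conditions derived in the paper may be less conservative.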

Abstract

This paper presents three main contributions to the field of multi-step system identification. First, drawing inspiration from Neural Network (NN) training, it introduces a tool for solving identification problems by leveraging first-order optimization and Automatic Differentiation (AD). The proposed method exploits gradients with respect to the parameters to be identified and leverages Linear Parameter-Varying (LPV) sensitivity equations to model gradient evolution. Second, it demonstrates that the computational complexity of the proposed method is linear in both the multi-step horizon length and the parameter size, ensuring scalability for large identification problems. Third, it formally addresses the "exploding gradient" issue: via a stability analysis of the LPV equations, it derives conditions for a reliable and efficient optimization and identification process for dynamical systems. Simulation results indicate that the proposed method is both effective and efficient, making it a promising tool for future research and applications in nonlinear system identification and non-convex optimization.
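As an illustration of the first contribution, the sketch below propagates a sensitivity matrix alongside the state to obtain the exact gradient of a multi-step cost. It is a minimal example under stated assumptions, not the authors' implementation: the two-state model in `step`, the sampling time `DT`, the mean-squared-error cost, and every function name are hypothetical, and the Jacobians are written by hand where an AD tool would normally supply them.

```python
import math

DT = 0.1  # hypothetical sampling time


def step(x, u, theta):
    """Hypothetical model: Euler-discretized oscillator with cubic damping."""
    x1, x2 = x
    return [x1 + DT * x2,
            x2 + DT * (-theta[0] * x1 - theta[1] * x2 ** 3 + u)]


def jac_x(x, u, theta):
    """Jacobian of step with respect to the state (n_x by n_x)."""
    _, x2 = x
    return [[1.0, DT],
            [-DT * theta[0], 1.0 - 3.0 * DT * theta[1] * x2 ** 2]]


def jac_theta(x, u, theta):
    """Jacobian of step with respect to the parameters (n_x by n_theta)."""
    x1, x2 = x
    return [[0.0, 0.0],
            [-DT * x1, -DT * x2 ** 3]]


def matmul(a, b):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def simulate(theta, x0, inputs):
    """Multi-step prediction x_1, ..., x_T from x0."""
    xs, x = [], list(x0)
    for u in inputs:
        x = step(x, u, theta)
        xs.append(x)
    return xs


def loss_and_grad(theta, x0, inputs, targets):
    """Mean squared multi-step error and its exact gradient in theta."""
    n_x, n_th = len(x0), len(theta)
    x = list(x0)
    lam = [[0.0] * n_th for _ in range(n_x)]  # dx_0/dtheta = 0 (x0 is fixed)
    loss, grad = 0.0, [0.0] * n_th
    for u, y in zip(inputs, targets):
        a, b = jac_x(x, u, theta), jac_theta(x, u, theta)
        al = matmul(a, lam)
        # Sensitivity update: lam_{k+1} = A_k lam_k + B_k
        lam = [[al[i][j] + b[i][j] for j in range(n_th)] for i in range(n_x)]
        x = step(x, u, theta)
        err = [xi - yi for xi, yi in zip(x, y)]
        loss += sum(e * e for e in err)
        for j in range(n_th):
            grad[j] += 2.0 * sum(err[i] * lam[i][j] for i in range(n_x))
    horizon = len(inputs)
    return loss / horizon, [g / horizon for g in grad]


# Synthetic data generated by the same model with "true" parameters [1.0, 0.5]
x0 = [1.0, 0.0]
inputs = [math.sin(0.3 * k) for k in range(50)]
targets = simulate([1.0, 0.5], x0, inputs)

theta = [0.6, 0.2]  # current parameter guess
loss, grad = loss_and_grad(theta, x0, inputs, targets)
```

The returned `grad` can be passed to any first-order optimizer. A central finite-difference comparison on `loss_and_grad` is a cheap way to check hand-written Jacobians like these before trusting the recursion on a larger model.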

Paper Structure

This paper contains 19 sections, 5 theorems, 26 equations, 2 figures, 1 table, and 1 algorithm.

Introduction
Preliminaries
Problem setup
Multi-step identification
Automatic differentiation in multi-step system identification
Gradient LPV equations
Proposed approach
Computational complexity
Non-exploding gradient
Numerical example
Complexity comparison
Exploding gradient: a population dynamics example
Conclusions
Proofs
Proof of Proposition 1
...and 4 more sections

Key Result

Proposition 1

Define the memory matrix as the matrix containing the total derivatives of the states with respect to $\theta$, and define vectors $\rho_k \in \mathbb{R}^{n_x}$, $\varrho_k \in \mathbb{R}^{n_\theta}$ as [...]. Then the gradient evolution with respect to the parameter $\theta$ over the multi-step horizon $T$ is described by a time-varying dynamical system for $k=1,\dots,T$, with $\Lambda_{0}$ [...]

Figures (2)

Figure 1: Multi-step model propagation.
Figure 2: Effect of barrier functions on mitigating exploding gradients for $N=50$ different system initial conditions, represented with $\pm1$ standard deviation bands around the mean trajectories.

Theorems & Definitions (10)

Remark 1: on the gradient LPV equations
Proposition 1: gradient dynamics -- $\theta$
Proposition 2: gradient dynamics -- $x_0$
Theorem 1: complexity analysis
Remark 2: backward AD in system identification
Remark 3: complexity improvement
Definition 1: non-exploding gradient
Theorem 2: gradient stability
Corollary 1: trajectory to gradient stability
Remark 4: trajectory stability via state barriers
