A scalable, gradient-stable approach to multi-step, nonlinear system identification using first-order methods

Cesare Donati, Martina Mammarella, Fabrizio Dabbene, Carlo Novara, Constantino Lagoa

TL;DR

The paper proposes a scalable gradient-based framework for nonlinear multi-step system identification that leverages automatic differentiation to compute exact gradients along a prediction horizon. By modeling the gradient evolution as a Linear Parameter-Varying (LPV) dynamical system, it achieves computational complexity that scales linearly with the horizon length $T$ and the number of parameters, specifically $\mathcal{O}(T n_x^2 n_\vartheta)$. A key contribution is a BIBO-based stability analysis that provides conditions to avoid exploding gradients during identification, including barrier-penalty techniques to keep trajectories within stable regions. The approach is validated through numerical examples demonstrating substantial computational gains and effective mitigation of gradient explosion, with implications for scalable nonlinear identification and non-convex optimization in dynamical systems.

Abstract

This paper presents three main contributions to the field of multi-step system identification. First, drawing inspiration from Neural Network (NN) training, it introduces a tool for solving identification problems by leveraging first-order optimization and Automatic Differentiation (AD). The proposed method exploits gradients with respect to the parameters to be identified and leverages Linear Parameter-Varying (LPV) sensitivity equations to model gradient evolution. Second, it demonstrates that the computational complexity of the proposed method is linear in both the multi-step horizon length and the parameter size, ensuring scalability for large identification problems. Third, it formally addresses the "exploding gradient" issue: via a stability analysis of the LPV equations, it derives conditions for a reliable and efficient optimization and identification process for dynamical systems. Simulation results indicate that the proposed method is both effective and efficient, making it a promising tool for future research and applications in nonlinear system identification and non-convex optimization.
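
As a rough illustration of the first two contributions, the sketch below computes a multi-step squared-error cost and its exact gradient in a single forward pass, by propagating the state sensitivity alongside the state. The one-state model `f`, its hand-written partial derivatives, the input signal, and all parameter values are hypothetical choices made for this example. The paper obtains the derivatives through AD, so this is a minimal sketch of the general idea and not the authors' implementation.

```python
import math

def f(x, u, theta):
    """Hypothetical one-state model: x_next = a*x + b*tanh(x) + u."""
    a, b = theta
    return a * x + b * math.tanh(x) + u

def df_dx(x, u, theta):
    """Partial derivative of f with respect to the state x."""
    a, b = theta
    return a + b * (1.0 - math.tanh(x) ** 2)

def df_dtheta(x, u, theta):
    """Partial derivatives of f with respect to the parameters (a, b)."""
    return [x, math.tanh(x)]

def simulate(theta, x0, inputs):
    """Roll the model forward and return the predicted states x_1, ..., x_T."""
    xs, x = [], x0
    for u in inputs:
        x = f(x, u, theta)
        xs.append(x)
    return xs

def cost_and_grad(theta, x0, inputs, targets):
    """Multi-step squared-error cost and its gradient with respect to theta."""
    x = x0
    s = [0.0, 0.0]      # sensitivity dx_k/dtheta; zero at k = 0 since x0 is fixed
    cost = 0.0
    grad = [0.0, 0.0]
    for u, y in zip(inputs, targets):
        jx = df_dx(x, u, theta)          # evaluated at the current state x_k
        jt = df_dtheta(x, u, theta)
        # Sensitivity recursion: linear in s, with coefficients that vary along the trajectory.
        s = [jx * si + jti for si, jti in zip(s, jt)]
        x = f(x, u, theta)
        err = x - y
        cost += err ** 2
        grad = [g + 2.0 * err * si for g, si in zip(grad, s)]
    return cost, grad

# Synthetic data from hypothetical "true" parameters, then a gradient at a different guess.
inputs = [math.sin(0.3 * k) for k in range(20)]
targets = simulate([0.5, 0.3], 0.1, inputs)
theta = [0.2, 0.1]
cost, grad = cost_and_grad(theta, 0.1, inputs, targets)
```

The loop does a fixed amount of work per time step, so the cost of one gradient evaluation grows linearly with the number of steps in `inputs`. The returned `grad` can be passed to any first-order optimizer.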

Paper Structure (19 sections, 5 theorems, 26 equations, 2 figures, 1 table, 1 algorithm)

This paper contains 19 sections, 5 theorems, 26 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Define the memory matrix as the matrix containing the total derivatives of the states with respect to $\theta$. Define vectors $\rho_k \in \mathbb R^{n_x}$, $\varrho_k \in \mathbb R^{n_\theta}$ as [...]. Then, the gradient evolution with respect to the parameter $\theta$ over the multi-step horizon $T$ is described by the following time-varying dynamical system [...] for $k=1,\dots,T$, with $\Lambda_{0}$ [...]
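
For orientation, the textbook forward-sensitivity recursion for a generic discrete-time model is shown below. The symbols $f$, $u_k$ and $S_k$ are assumptions introduced for this sketch, and the form may differ from the exact statement of Proposition 1.

$$
x_{k+1} = f(x_k, u_k, \theta), \qquad
S_{k+1} = \frac{\partial f}{\partial x}(x_k, u_k, \theta)\, S_k + \frac{\partial f}{\partial \theta}(x_k, u_k, \theta), \qquad
S_k = \frac{\mathrm{d} x_k}{\mathrm{d} \theta} \in \mathbb R^{n_x \times n_\theta}
$$

Here $S_0 = 0$ if the initial state does not depend on $\theta$. The recursion is linear in $S_k$, with coefficient matrices that vary along the trajectory, which is what allows it to be read as an LPV system. Each step multiplies an $n_x \times n_x$ matrix by an $n_x \times n_\theta$ matrix, costing $\mathcal{O}(n_x^2 n_\theta)$ operations, so $T$ steps cost $\mathcal{O}(T n_x^2 n_\theta)$, in line with the complexity quoted in the TL;DR.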

Figures (2)

  • Figure 1: Multi-step model propagation.
  • Figure 2: Effect of barrier functions on mitigating exploding gradients for $N=50$ different system initial conditions, represented with $\pm1$ standard deviation bands around the mean trajectories.

Theorems & Definitions (10)

  • Remark 1: on the gradient LPV equations
  • Proposition 1: gradient dynamics -- $\theta$
  • Proposition 2: gradient dynamics -- $x_0$
  • Theorem 1: complexity analysis
  • Remark 2: backward AD in system identification
  • Remark 3: complexity improvement
  • Definition 1: non-exploding gradient
  • Theorem 2: gradient stability
  • Corollary 1: trajectory to gradient stability
  • Remark 4: trajectory stability via state barriers
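
The link between trajectory stability and non-exploding gradients listed above can be illustrated on a toy case. The scalar linear model, the constant input, the bound `x_max`, and the weight `mu` below are all hypothetical, and the penalty is a generic log-barrier rather than the specific barrier function used in the paper.

```python
import math

def sensitivity_norms(a, x0, inputs):
    """Run x_{k+1} = a*x_k + u_k together with its sensitivity to a.

    The sensitivity obeys s_{k+1} = a*s_k + x_k; the function returns |s_k| at each step.
    """
    x, s, out = x0, 0.0, []
    for u in inputs:
        s = a * s + x      # uses x_k, so it is updated before the state
        x = a * x + u
        out.append(abs(s))
    return out

def log_barrier(xs, x_max, mu=1e-2):
    """Generic log-barrier on the states: finite while |x| < x_max, infinite otherwise."""
    total = 0.0
    for x in xs:
        margin = 1.0 - (x / x_max) ** 2
        if margin <= 0.0:
            return math.inf
        total -= mu * math.log(margin)
    return total

inputs = [1.0] * 50
stable = sensitivity_norms(0.9, 0.0, inputs)     # |a| < 1: sensitivity stays bounded
unstable = sensitivity_norms(1.1, 0.0, inputs)   # |a| > 1: sensitivity grows geometrically
```

With `a = 0.9` the sensitivity stays bounded for the bounded input, while with `a = 1.1` it grows geometrically with the horizon. Adding a penalty such as `log_barrier` to the identification cost makes any parameter value whose predicted states leave the region `|x| < x_max` infinitely expensive, which is one simple way to keep the optimizer away from such trajectories.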