Stability-Preserving Online Adaptation of Neural Closed-loop Maps

Danilo Saccani, Luca Furieri, Giancarlo Ferrari-Trecate

Abstract

The growing complexity of modern control tasks calls for controllers that can react online as objectives and disturbances change, while preserving closed-loop stability. Recent approaches for improving the performance of nonlinear systems while preserving closed-loop stability rely on time-invariant recurrent neural-network controllers, but offer no principled way to update the controller during operation. Most importantly, switching from one stabilizing policy to another can itself destabilize the closed-loop. We address this problem by introducing a stability-preserving update mechanism for nonlinear, neural-network-based controllers. Each controller is modeled as a causal operator with bounded $\ell_p$-gain, and we derive gain-based conditions under which the controller may be updated online. These conditions yield two practical update schemes, time-scheduled and state-triggered, that guarantee the closed-loop remains $\ell_p$-stable after any number of updates. Our analysis further shows that stability is decoupled from controller optimality, allowing approximate or early-stopped controller synthesis. We demonstrate the approach on nonlinear systems with time-varying objectives and disturbances, and show consistent performance improvements over static and naive online baselines while guaranteeing stability.

Paper Structure (16 sections, 3 theorems, 52 equations, 4 figures)

Key Result

Theorem 1

(adapted from Theorem 1 in furieriPerformance) Assume that the operator $\mathbfcal{F}$ is $\ell_p$-stable, and consider the evolution of eq:operatorForm where $\mathbf{u}$ is defined as in eq:SLSinput for a causal operator $\mathbfcal{M}:\ell^n\rightarrow \ell^m$. Let $\mathbf{K}$ be the operator for which $\mathbf{u=K(x)}$ is equivalent to eq:SLSinput. The following two statements hold true.
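The internal-model structure behind this parametrization can be illustrated with a small simulation. The sketch below is not the paper's implementation. It assumes a plant of the form $x_t = f(x_{t-1}, u_{t-1}) + w_t$ with a scalar, stable linear $f$, and it uses a hypothetical stand-in for $\mathbfcal{M}$ (a contracting recurrence followed by `tanh`, which has finite gain). The names `plant_step`, `StableOperatorM`, and `simulate`, and all numerical values, are illustrative assumptions. The controller runs a copy of the model, reconstructs the disturbance as $\hat{w}_t = x_t - f(x_{t-1}, u_{t-1})$, and feeds it through the stable operator to obtain $u_t$.

```python
import numpy as np

def plant_step(x_prev, u_prev, a=0.8, b=1.0):
    """Assumed nominal model f(x, u); |a| < 1 makes the open loop stable."""
    return a * x_prev + b * u_prev

class StableOperatorM:
    """Hypothetical stand-in for the free operator M.

    A contracting linear recurrence (|rho| < 1) followed by tanh, so the
    output is bounded by |gain| / (1 - rho) times the input magnitude.
    """
    def __init__(self, rho=0.5, gain=-0.3):
        self.rho, self.gain, self.xi = rho, gain, 0.0

    def __call__(self, w_hat):
        self.xi = self.rho * self.xi + w_hat   # internal state update
        return self.gain * np.tanh(self.xi)    # control input u_t

def simulate(w, M):
    """Closed loop: the controller reconstructs w from x and the model."""
    T = len(w)
    x, u, w_hat = np.zeros(T), np.zeros(T), np.zeros(T)
    x_prev, u_prev = 0.0, 0.0
    for t in range(T):
        pred = plant_step(x_prev, u_prev)  # model prediction
        x[t] = pred + w[t]                 # true plant with disturbance
        w_hat[t] = x[t] - pred             # reconstructed disturbance
        u[t] = M(w_hat[t])                 # controller acts on w_hat only
        x_prev, u_prev = x[t], u[t]
    return x, u, w_hat

# Impulse disturbance at t = 0, zero afterwards.
w = np.zeros(200)
w[0] = 1.0
x, u, w_hat = simulate(w, StableOperatorM())
```

With an exact model the reconstructed signal `w_hat` coincides with `w`, so the control input depends on the disturbance only through the stable operator. In this toy setting the state decays to zero after the impulse for any choice of `rho` with magnitude below one, regardless of how well the operator is tuned for performance.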

Figures (4)

  • Figure 1: IMC architecture parametrizing all stability-preserving controllers via a free operator $\mathbfcal{M}\in\mathcal{L}_p$.
  • Figure 2: Qualitative closed-loop behavior of Algorithm 1 in the two case studies. (Left) Mountains problem at $\tau = 0.8\,\mathrm{s}$. Green: obstacles. Colored lines: predicted trajectories over $[\tau,\tau+1.25]$; black: executed trajectory over $[0,\tau)$; gray dashed: predicted continuation after $\tau$. Colored disks show agent positions and radii. (Right) Dynamic-obstacles problem at $\tau_1 = 9.3\,\mathrm{s}$. Gray: executed trajectory; red dash-dot: reference; blue: predicted trajectory over $[\tau_1,\tau_1+0.5]$; green: obstacle positions at $\tau_1$.
  • Figure 3: Mountains problem - Total cost (log scale) over 50 runs: nominal scenario (center) and perturbed scenario with impulse disturbances \ref{eq:delta} (right).
  • Figure 4: Dynamic obstacles problem - Total cost (log scale) over 50 runs for the offline controller in furieriPerformance, a receding-horizon open-loop (RHO) planner, and the proposed approach.

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Corollary 1: ISS under time-scheduled persistent updates
  • proof