Input-to-State Stable Coupled Oscillator Networks for Closed-form Model-based Control in Latent Space

Maximilian Stölzle, Cosimo Della Santina

TL;DR

This work tackles latent-space control of physical systems from high-dimensional observations by introducing Input-to-State Stable Coupled Oscillator Networks (CON), a latent-dynamics model with Lagrangian structure that supports both dynamics learning and model-based control in latent space. The authors prove Global Asymptotic Stability for the unforced CON via a Lyapunov candidate $V_\mu(\tilde{y}_w)$ and Global Input-to-State Stability for the forced system. The model is learned from pixels through a $\beta$-VAE, and a potential-energy-based controller combines feedforward potential compensation with a P-satI-D feedback term. To accelerate training, they derive CFA-CON, an approximate closed-form rollout that decouples linear dynamics from nonlinear residuals, and show that CFA-CON can double training speed with minimal loss in accuracy. Empirically, CON and CFA-CON achieve competitive or state-of-the-art prediction across several mechanical and soft-robot datasets, with CON-M and CFA-CON delivering strong results on actuated systems, and a latent-space controller that exploits the learned potential tracks setpoints effectively. The work thus provides stability guarantees, efficient training, and model-based control directly from image-based observations, and leaves the extension to more complex or non-ideal physical systems as future work.
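
The closed-form rollout idea can be illustrated with a rough sketch. It is not the authors' derivation. It assumes unit-mass dynamics of the form $\ddot{x} + Kx + D\dot{x} + \tanh(Wx + b) = \tau$, treats the diagonal parts of $K$ and $D$ as $n$ decoupled, underdamped oscillators, and freezes everything else (off-diagonal coupling, the $\tanh$ term, and the forcing) as a constant residual over one step. The function name `cfa_step` and all parameter choices are hypothetical.

```python
import numpy as np

def cfa_step(x, v, tau, K, D, W, b, dt):
    """One approximate closed-form step (illustrative; assumes underdamped oscillators)."""
    k, d = np.diag(K), np.diag(D)  # decoupled (diagonal) stiffness and damping
    # Residual force held constant over the step:
    # forcing, off-diagonal coupling, and the tanh term (assumed model form).
    residual = tau - (K - np.diag(k)) @ x - (D - np.diag(d)) @ v - np.tanh(W @ x + b)
    alpha = d / 2.0
    omega_d = np.sqrt(k - alpha**2)  # damped frequency, requires d^2 < 4k
    x_eq = residual / k              # equilibrium of each forced linear oscillator
    y0 = x - x_eq                    # displacement from that equilibrium
    decay = np.exp(-alpha * dt)
    c, s = np.cos(omega_d * dt), np.sin(omega_d * dt)
    # Closed-form solution of y'' + d y' + k y = 0 evaluated at time dt.
    x_next = x_eq + decay * (y0 * c + (v + alpha * y0) / omega_d * s)
    v_next = decay * (v * c - (alpha * v + k * y0) / omega_d * s)
    return x_next, v_next
```

When $K$ and $D$ are diagonal and the $\tanh$ term vanishes, this step is exact for any `dt`. In the coupled case, the approximation error comes only from holding the residual constant across the step.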

Abstract

Even though a variety of methods have been proposed in the literature, efficient and effective latent-space control (i.e., control in a learned low-dimensional space) of physical systems remains an open challenge. We argue that a promising avenue is to leverage powerful and well-understood closed-form strategies from control theory literature in combination with learned dynamics, such as potential-energy shaping. We identify three fundamental shortcomings in existing latent-space models that have so far prevented this powerful combination: (i) they lack the mathematical structure of a physical system, (ii) they do not inherently conserve the stability properties of the real systems, (iii) these methods do not have an invertible mapping between input and latent-space forcing. This work proposes a novel Coupled Oscillator Network (CON) model that simultaneously tackles all these issues. More specifically, (i) we show analytically that CON is a Lagrangian system - i.e., it possesses well-defined potential and kinetic energy terms. Then, (ii) we provide formal proof of global Input-to-State stability using Lyapunov arguments. Moving to the experimental side, we demonstrate that CON reaches SoA performance when learning complex nonlinear dynamics of mechanical systems directly from images. An additional methodological innovation contributing to achieving this third goal is an approximated closed-form solution for efficient integration of network dynamics, which eases efficient training. We tackle (iii) by approximating the forcing-to-input mapping with a decoder that is trained to reconstruct the input based on the encoded latent space force. Finally, we show how these properties enable latent-space control. We use an integral-saturated PID with potential force compensation and demonstrate high-quality performance on a soft robot using raw pixels as the only feedback information.
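
To make the control idea concrete, here is a minimal sketch of feedforward potential-force compensation combined with a P-satI-D feedback term. The exact control law is not reproduced in this summary, so the sketch rests on assumptions: latent dynamics of the form $\ddot{z} + Kz + D\dot{z} + \tanh(Wz + b) = \tau$ with unit masses, an integral term that accumulates $\tanh(\nu e)$ of the tracking error $e = z^d - z$, and hypothetical gains `Kp`, `Ki`, `Kd`. The controller is tested on the model itself rather than on a robot.

```python
import numpy as np

def potential_force(z, K, W, b):
    """Elastic plus neuron-like force of the assumed CON potential."""
    return K @ z + np.tanh(W @ z + b)

def p_sati_d_control(z, z_dot, z_des, integral, K, W, b, Kp, Ki, Kd):
    """Feedforward potential compensation at the setpoint plus P-satI-D feedback."""
    feedforward = potential_force(z_des, K, W, b)
    feedback = Kp @ (z_des - z) + Ki @ integral - Kd @ z_dot
    return feedforward + feedback

def simulate_closed_loop(z_des, K, D, W, b, Kp, Ki, Kd, nu=1.0, dt=0.01, steps=4000):
    """Explicit-Euler closed-loop rollout starting from rest at the origin."""
    n = z_des.shape[0]
    z, z_dot, integral = np.zeros(n), np.zeros(n), np.zeros(n)
    for _ in range(steps):
        tau = p_sati_d_control(z, z_dot, z_des, integral, K, W, b, Kp, Ki, Kd)
        z_ddot = tau - potential_force(z, K, W, b) - D @ z_dot
        # Saturated integral: the error passes through tanh before accumulation.
        integral = integral + dt * np.tanh(nu * (z_des - z))
        z, z_dot = z + dt * z_dot, z_dot + dt * z_ddot
    return z, z_dot
```

At the setpoint with zero velocity and an empty integrator, the command reduces to the feedforward term, which is exactly the force needed to hold the assumed model in place. The feedback terms then only handle transients and model mismatch.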

Paper Structure

This paper contains 78 sections, 13 theorems, 74 equations, 25 figures, 10 tables, and 1 algorithm.

Key Result

Lemma 1

Let $K_\mathrm{w} \succ 0$. Then, the dynamics defined in Eq. (eq:conw_dynamics) have a single, isolated equilibrium $\bar{y}_\mathrm{w} = \begin{bmatrix} \bar{x}_\mathrm{w}^\mathrm{T} & 0^\mathrm{T} \end{bmatrix}^\mathrm{T}$.
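
One way to see why $K_\mathrm{w} \succ 0$ rules out multiple equilibria is sketched below. This is not the authors' proof. It assumes that the unforced dynamics in these coordinates contain the static force $K_\mathrm{w} x_\mathrm{w} + \tanh(x_\mathrm{w} + b)$, so that an equilibrium has zero velocity and a position $\bar{x}_\mathrm{w}$ at which this force vanishes. The symbol $f$ is introduced here only for illustration.

```latex
\begin{aligned}
f(x_\mathrm{w}) &= K_\mathrm{w}\, x_\mathrm{w} + \tanh(x_\mathrm{w} + b),
  \qquad f(\bar{x}_\mathrm{w}) = 0, \\
\big(f(a) - f(c)\big)^\mathrm{T} (a - c)
  &= (a - c)^\mathrm{T} K_\mathrm{w}^\mathrm{T} (a - c)
   + \big(\tanh(a + b) - \tanh(c + b)\big)^\mathrm{T} (a - c) > 0
  \quad \text{for } a \neq c .
\end{aligned}
```

The first term is strictly positive because $K_\mathrm{w} \succ 0$, and the second is non-negative because $\tanh$ is elementwise monotonically increasing. Hence $f$ is strictly monotone and cannot vanish at two distinct points. Existence of a root follows because $f$ is continuous and strongly monotone.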

Figures (25)

  • Figure 1: Panel (a): The proposed CON network consists of $n$ damped harmonic oscillators that are coupled through the neuron-like connection $\tanh(Wx+b)$ and the non-diagonal stiffness $K-k$ and damping coefficients $D-d$, respectively. The state of the network is captured by the positions $x(t)$ and velocities $\dot{x}(t)$ of the oscillators. The time-dependent input is mapped through the (possibly nonlinear) function $g(u)$ to a forcing $\tau$ acting on the oscillators. Panel (b): Exploiting CON for learning latent dynamics from pixels: We encode the initial observation $o(t_0)$ and the input $u(t)$ into latent space where we leverage the CON to predict future latent states. Finally, we decode both the latent-space torques $\tau(t)$ and the predicted latent states $z(t)$.
  • Figure 2: Analysis of the approximation error of CFA-CON: we compare the ground-truth solution of a 40 s rollout of the CON network consisting of three oscillators ($n=3$) with the CFA-CON executed at a time step of $\delta t = 0.1\,\mathrm{s}$ and a solution generated by integrating the ODE at a time step of $\delta t = 0.05\,\mathrm{s}$ with the Euler method.
  • Figure 3: Evaluation of prediction performance of the various models vs. the dimension of their latent representation $n_z$ and the number of trainable parameters of the dynamics model, respectively, on the PCC-NS-2 dataset. All hyperparameters are tuned for each model separately for $n_z=8$. The error bar denotes the standard deviation across three random seeds.
  • Figure 4: Panel (a): Samples of some of the datasets used as part of the experimental verification, specifically for the results reported in the latent-dynamics results table (tab:latent_dynamics_results). The real-world Reaction-Diffusion image is adopted from [epstein2016reaction]. Panel (b): Model-based control in latent space by exploiting the physical structure of the CON model.
  • Figure 5: Latent-space control of a continuum soft robot (simulated using two piecewise constant curvature segments) following a sequence of setpoints: The upper two rows show the performance of a pure P-satI-D feedback controller operating in the latent space $z$ learned with the MECH-NODE and CON models, respectively. The lower row displays the results for a latent-space controller based on the CON model that additionally compensates for the learned potential forces.
  • ...and 20 more figures
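
The oscillator network described in the Figure 1 caption can be sketched numerically. The snippet below assumes unit masses and dynamics of the form $\ddot{x} + Kx + D\dot{x} + \tanh(Wx + b) = \tau$, with the forcing $\tau = g(u)$ supplied directly, and integrates with the explicit Euler method mentioned in the Figure 2 caption. Function names and parameter values are illustrative and not taken from the paper.

```python
import numpy as np

def con_acceleration(x, x_dot, tau, K, D, W, b):
    """Acceleration of the assumed unit-mass CON dynamics."""
    return tau - K @ x - D @ x_dot - np.tanh(W @ x + b)

def rollout_euler(x0, x_dot0, tau, K, D, W, b, dt=0.05, steps=800):
    """Explicit-Euler rollout under constant forcing tau.

    Returns the position trajectory with shape (steps + 1, n)
    and the final velocity.
    """
    x, x_dot = x0.copy(), x_dot0.copy()
    positions = [x.copy()]
    for _ in range(steps):
        x_ddot = con_acceleration(x, x_dot, tau, K, D, W, b)
        # Both states are advanced with values from the start of the step.
        x, x_dot = x + dt * x_dot, x_dot + dt * x_ddot
        positions.append(x.copy())
    return np.stack(positions), x_dot
```

With positive definite $K$ and $D$, zero forcing, and $b = 0$, a rollout from a nonzero initial state should settle at the origin, which gives a quick sanity check of the stability behaviour the paper proves for the continuous-time model.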

Theorems & Definitions (29)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Lemma 3
  • proof
  • ...and 19 more