Lyapunov Stability Learning with Nonlinear Control via Inductive Biases

Yupu Lu, Shijie Lin, Hao Xu, Zeqing Zhang, Jia Pan

TL;DR

This work tackles stability guarantees for nonlinear control by learning neural control Lyapunov functions (CLFs) and CLF-based controllers within an end-to-end framework. It reframes Lyapunov conditions as inductive biases, enabling a self-supervised learning pipeline that jointly estimates system dynamics, a CLF, and a bounded controller, while integrating verification into training. The proposed sum-of-squares neural CLF with a bounded controller, coupled with a Lyapunov-based loss and optional geometric shaping, yields higher convergence rates and larger regions of attraction (ROAs) than prior approaches, as demonstrated on unicycle path following (PF), the inverted pendulum, and extensions to more complex 4-DOF and 6-DOF systems. The approach simplifies implementation by avoiding external verifiers and improves robustness and scalability for safety-critical nonlinear control applications.

Abstract

Finding a control Lyapunov function (CLF) for a dynamical system with a controller is an effective way to guarantee stability, which is crucial in safety-critical applications. Recently, deep learning models representing CLFs have been used in a learner-verifier framework to identify candidates that satisfy the Lyapunov conditions. However, the learner treats the Lyapunov conditions as complex optimisation constraints, which makes global convergence hard to achieve. Implementing these Lyapunov conditions for verification is also complicated. To improve this framework, we treat Lyapunov conditions as inductive biases and design a neural CLF and a CLF-based controller guided by this knowledge. This design enables a stable optimisation process with limited constraints, and allows end-to-end learning of both the CLF and the controller. Our approach achieves a higher convergence rate and a larger region of attraction (ROA) in learning the CLF than existing methods across a wide range of experimental cases. We also analyse in detail why the success rate of previous methods decreases during learning.
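
For reference, the Lyapunov conditions mentioned in the abstract can be written in their standard textbook form. This is a sketch under assumptions: the dynamics are taken to be control-affine, with $\mathbf{f}$ and $\mathbf{g}$ as in the Figure 2 caption, the equilibrium is placed at the origin, and $\mathcal{D}$ denotes the region in which the conditions hold. The paper's exact formulation may differ.

```latex
\begin{align*}
  \dot{\mathbf{s}} &= \mathbf{f}(\mathbf{s}) + \mathbf{g}(\mathbf{s})\,\mathbf{u}, \\
  V(\mathbf{0}) &= 0, \\
  V(\mathbf{s}) &> 0 \quad \forall\, \mathbf{s} \in \mathcal{D} \setminus \{\mathbf{0}\}, \\
  \dot{V}(\mathbf{s}) = \nabla V(\mathbf{s})^{\top}
    \bigl(\mathbf{f}(\mathbf{s}) + \mathbf{g}(\mathbf{s})\,\mathbf{u}\bigr) &< 0
    \quad \forall\, \mathbf{s} \in \mathcal{D} \setminus \{\mathbf{0}\}.
\end{align*}
```

Treating these conditions as inductive biases means building the first two into the structure of $V$ itself, so that only the decrease condition on $\dot{V}$ has to be enforced by the loss.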

Paper Structure

This paper contains 11 sections, 15 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: An illustration of a CLF candidate satisfying Lyapunov conditions within the closed state region $\Omega$. It displays the maximum level set $\mathcal{D}$ that can be found. The CLF value is strictly decreasing with respect to time in $\mathcal{D}$ because it is a closed region.
  • Figure 2: The self-supervised framework for jointly learning the neural CLF and the CLF-based controller for nonlinear dynamics. Solid black arrows represent forward flows: given the state $\mathbf{s}$, the values of the dynamics $\mathbf{f}, \mathbf{g}$ and of the Lyapunov function $V, \nabla V$ are obtained and then fed into the CLF-based control policy to compute the control $\mathbf{u}$. The loss is then calculated for optimisation. Red dash-dot arrows represent backpropagation flows for the different loss terms.
  • Figure 3: The training results of our method. (a) shows the learned neural CLF candidates (colormaps) and their derivatives (blue wireframes) for inverted pendulum dynamics (left) and path following (right) in 3D space. The contours of the CLF are also projected onto the plane $V=0$. (b) presents the ROAs (black contours) of the CLFs. The ROA is considerably larger with the geometric shaping term (right) than without it (left).
  • Figure 4: Four representative training cases of NLC and ULC. (a-b) plot the neural CLF candidates $V(\mathbf{s})$ that pass the verification, although (b) is not a CLF. (c-d) show the derivatives $\dot{V}(\mathbf{s})$, where the verification passes in (c) but fails in (d). Regions overlooked by the verifier ($\{\mathbf{s}\,|\,\Vert\mathbf{s}\Vert < 0.1\}$) and regions violating the Lyapunov condition ($\dot{V}(\mathbf{s})>0$) are outlined by dark red dashed circles and black dash-dot contours, respectively. Coloured dashed lines represent simulated trajectories, with black stars marking their end points. The equilibrium points are marked with red stars.
  • Figure 5: The area of the ROA and the success rate as functions of $\eta_1$ for the inverted pendulum (a) and path following dynamics (b). Increasing $\eta_1$ can enlarge the area of the ROA, but can also decrease the success rate in training.
  • ...and 1 more figures
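
The forward flow described in the Figure 2 caption (state, then dynamics and $V, \nabla V$, then CLF-based control, then loss) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. Everything concrete in it is an assumption: the random untrained feature network `phi`, the sum-of-squares form of `V`, the `tanh`-saturated policy in `control`, the pendulum parameters, and the hinge form of `lyapunov_loss` are all hypothetical choices made to show how positivity of $V$ and boundedness of $\mathbf{u}$ can hold by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature network phi: R^2 -> R^8 with random, untrained weights.
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2, b2 = rng.normal(size=(8, 16)), rng.normal(size=8)

def phi(s):
    return W2 @ np.tanh(W1 @ s + b1) + b2

def V(s, eps=1e-2):
    """Sum-of-squares candidate: V(0) = 0 and V(s) > 0 for s != 0 by construction."""
    d = phi(s) - phi(np.zeros_like(s))
    return float(d @ d + eps * (s @ s))

def grad_V(s, h=1e-5):
    """Central finite-difference gradient (autodiff would replace this in practice)."""
    grad = np.zeros_like(s)
    for i in range(s.size):
        e = np.zeros_like(s)
        e[i] = h
        grad[i] = (V(s + e) - V(s - e)) / (2 * h)
    return grad

# Assumed inverted-pendulum dynamics in control-affine form s_dot = f(s) + g(s) * u,
# with s = (theta, omega). The physical parameters are illustrative only.
GRAV, LENGTH, MASS, DAMP = 9.81, 0.5, 0.15, 0.1

def f(s):
    theta, omega = s
    return np.array([omega,
                     (GRAV / LENGTH) * np.sin(theta) - DAMP / (MASS * LENGTH**2) * omega])

def g(s):
    return np.array([0.0, 1.0 / (MASS * LENGTH**2)])

def control(s, u_max=2.0, k=1.0):
    """Assumed bounded policy: |u| <= u_max, and its contribution to V_dot is <= 0."""
    LgV = grad_V(s) @ g(s)
    return -u_max * np.tanh(k * LgV)

def V_dot(s):
    dV = grad_V(s)
    return float(dV @ f(s) + (dV @ g(s)) * control(s))

def lyapunov_loss(states, alpha=0.1):
    """Hinge penalty on violations of V_dot <= -alpha * V (assumed loss form)."""
    return float(np.mean([max(0.0, V_dot(s) + alpha * V(s)) for s in states]))

states = rng.uniform(-1.0, 1.0, size=(64, 2))
print("Lyapunov loss on random states:", lyapunov_loss(states))
```

With this structure, the conditions $V(\mathbf{0}) = 0$ and $V(\mathbf{s}) > 0$ need no loss term, and the control input cannot exceed `u_max`. Only the decrease condition on $\dot{V}$ remains for the optimiser, which is the sense in which the constraints are "limited". A trained version would update the weights of `phi` by backpropagating through `lyapunov_loss`.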