
Lyapunov Neural ODE State-Feedback Control Policies

Joshua Hang Sai Ip, Georgios Makrygiorgos, Ali Mesbah

TL;DR

This work addresses constrained continuous-time optimal control problems (OCPs) by learning a state-feedback policy within a neural ODE (NODE) framework while guaranteeing stability. It introduces Lyapunov-NODE control (L-NODEC), which embeds an exponentially-stabilizing control Lyapunov function (ES-CLF) into a Lyapunov loss that enforces $\frac{\partial V}{\partial x}^\top \mathcal{F}_{\theta}(x,t)+\kappa V(x)\le 0$ and yields exponential convergence $\|x(t)-z\|_P \le e^{-\kappa t/2}\|x_0-z\|_P$ when the loss vanishes. The authors prove that zero Lyapunov loss implies that $V$ is an ES-CLF and provide an adversarial robustness bound for initial-state perturbations, plus a learning framework that handles state and input constraints through penalties. They demonstrate two case studies (a double integrator and thermal-dose delivery in plasma medicine), showing faster target attainment and improved robustness compared to NODEC, highlighting practical potential for safety-critical, constraint-bound control tasks.
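
The factor $\kappa/2$ in the convergence bound follows from a standard comparison argument. As a sketch, assume the quadratic potential $V(x)=\|x-z\|_P^2$ (an assumption made here for illustration; the summary does not state the form of $V$), with $z$ the target state and $\dot{x}=\mathcal{F}_{\theta}(x,t)$ the closed-loop dynamics:

$$
\dot{V}(x(t)) = \frac{\partial V}{\partial x}^\top \mathcal{F}_{\theta}(x,t) \le -\kappa V(x(t))
\;\Longrightarrow\;
V(x(t)) \le e^{-\kappa t}\, V(x_0)
\;\Longrightarrow\;
\|x(t)-z\|_P \le e^{-\kappa t/2}\,\|x_0-z\|_P .
$$

The first implication integrates the differential inequality (Grönwall's inequality); the second takes square roots of both sides, which halves the exponent.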

Abstract

Deep neural networks are increasingly used as an effective parameterization of control policies in various learning-based control paradigms. For continuous-time optimal control problems (OCPs), which are central to many decision-making tasks, control policy learning can be cast as a neural ordinary differential equation (NODE) problem wherein state and control constraints are naturally accommodated. This paper presents a NODE approach to solving continuous-time OCPs for the case of stabilizing a known constrained nonlinear system around a target state. The approach, termed Lyapunov-NODE control (L-NODEC), uses a novel Lyapunov loss formulation that incorporates an exponentially-stabilizing control Lyapunov function to learn a state-feedback neural control policy, bridging the gap of solving continuous-time OCPs via NODEs with stability guarantees. The proposed Lyapunov loss allows L-NODEC to guarantee exponential stability of the controlled system, as well as its adversarial robustness to perturbations to the initial state. The performance of L-NODEC is illustrated in two problems, including a dose delivery problem in plasma medicine. In both cases, L-NODEC effectively stabilizes the controlled system around the target state despite perturbations to the initial state and reduces the inference time necessary to reach the target.
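
To make the idea of a Lyapunov loss concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a hinge penalty on the ES-CLF inequality averaged over sampled states (the paper's exact loss may differ), a quadratic potential $V(x) = x^\top P x$ with a hypothetical matrix `P`, target state $z = 0$, and a linear feedback law standing in for the neural policy. The names `dynamics`, `lyapunov_loss`, `linear_policy`, and `zero_policy` are all illustrative, and $\kappa = 0.5$ is chosen only so that the stand-in policy satisfies the inequality.

```python
import numpy as np

KAPPA = 0.5                             # hypothetical decay rate, chosen for this toy example
P = np.array([[1.5, 0.5], [0.5, 0.5]])  # hypothetical symmetric positive-definite matrix
Z = np.zeros(2)                         # target state (assumed to be the origin)


def dynamics(x, u):
    """Double integrator: x1' = x2, x2' = u, for a batch of states."""
    return np.stack([x[..., 1], u], axis=-1)


def lyapunov_loss(policy, states, kappa=KAPPA):
    """Mean hinge penalty on dV/dx^T f(x, u(x)) + kappa * V(x) <= 0."""
    e = states - Z
    V = np.einsum("ni,ij,nj->n", e, P, e)  # V(x) = (x - z)^T P (x - z)
    grad_V = 2.0 * e @ P                   # gradient of V; valid because P is symmetric
    f = dynamics(states, policy(states))
    violation = np.sum(grad_V * f, axis=-1) + kappa * V
    return float(np.mean(np.maximum(0.0, violation)))


def linear_policy(x):
    """Stand-in for a trained policy: u = -x1 - 2*x2."""
    return -(x @ np.array([1.0, 2.0]))


def zero_policy(x):
    """Uncontrolled system: u = 0."""
    return np.zeros(x.shape[0])


rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(256, 2))
loss_stable = lyapunov_loss(linear_policy, states)
loss_zero = lyapunov_loss(zero_policy, states)
print(loss_stable, loss_zero)
```

For this particular `P` and gain, the closed loop gives $\dot{V} = -\|x\|^2$, so the inequality holds everywhere and the loss of `linear_policy` vanishes, while the uncontrolled system incurs a positive loss. In a learning setting the policy would be a neural network and the loss would be minimized with respect to its parameters.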

Paper Structure (13 sections, 37 equations, 3 figures, 1 algorithm)

Figures (3)

  • Figure 1: Phase portraits of the controlled double integrator system for L-NODEC (left), L-NODEC (Constrained) (middle), and NODEC (right). The nominal trajectory and adversarial trajectories are shown in black and orange, respectively. The initial states for the adversarial trajectories are based on different initial positions $x_1$ sampled from a uniform distribution over $[-0.05, 0.05]$ with the nominal initial velocity 0. The streamlines in the vector field and the constraint for velocity $x_2$ are shown in blue and red, respectively.
  • Figure 2: Lyapunov decay, defined as the ratio of the potential functions at times $t$ and $t = 0$, for L-NODEC, L-NODEC (Constrained), and NODEC. The decay is shown for nominal and 20 adversarial trajectories. The exponential stability threshold for $\kappa =5$ is shown in black.
  • Figure 3: Optimal control of thermal dose delivery of cold atmospheric plasma to a target surface for L-NODEC. (a) The delivered thermal dose CEM. (b) Surface temperature. (c) Control input, i.e., applied power to plasma. 50 adversarial trajectories with initial states in a perturbation radius of $2^\circ$C around the nominal initial temperature are simulated and truncated once the desired CEM threshold is met. L-NODEC enables shorter treatment protocols, which are highly desirable in plasma medicine.
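
The Lyapunov decay described for Figure 2, the ratio $V(x(t))/V(x_0)$, can be checked against an exponential threshold in a few lines. The sketch below is illustrative only: it assumes the threshold is $e^{-\kappa t}$ (the bound implied by the ES-CLF inequality), a hypothetical quadratic potential `P`, a hand-picked linear feedback law in place of the learned policy, a hypothetical nominal initial state `(1, 0)`, and $\kappa = 0.5$ rather than the $\kappa = 5$ used in the figure. The initial positions are offset by samples from a uniform distribution over $[-0.05, 0.05]$ with zero velocity offset, mirroring the setup described for Figure 1.

```python
import numpy as np

KAPPA = 0.5                                  # hypothetical rate (Figure 2 uses kappa = 5)
P = np.array([[1.5, 0.5], [0.5, 0.5]])       # hypothetical potential V(x) = x^T P x
A_CL = np.array([[0.0, 1.0], [-1.0, -2.0]])  # double integrator with stand-in policy u = -x1 - 2*x2
DT, STEPS = 0.01, 500


def rollout(x0, dt=DT, steps=STEPS):
    """Integrate x' = A_CL x with classical RK4; returns an array of shape (steps + 1, 2)."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        x = xs[-1]
        k1 = A_CL @ x
        k2 = A_CL @ (x + 0.5 * dt * k1)
        k3 = A_CL @ (x + 0.5 * dt * k2)
        k4 = A_CL @ (x + dt * k3)
        xs.append(x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4))
    return np.array(xs)


def lyapunov_decay(xs):
    """Ratio V(x(t)) / V(x(0)) along one trajectory."""
    V = np.einsum("ti,ij,tj->t", xs, P, xs)
    return V / V[0]


rng = np.random.default_rng(0)
nominal_x0 = np.array([1.0, 0.0])            # hypothetical nominal initial state
offsets = rng.uniform(-0.05, 0.05, size=20)  # 20 perturbed initial positions
t = DT * np.arange(STEPS + 1)
threshold = np.exp(-KAPPA * t)

decays = [lyapunov_decay(rollout(nominal_x0 + np.array([d, 0.0]))) for d in offsets]
all_below = all(np.all(d <= threshold + 1e-9) for d in decays)
print("all trajectories below threshold:", all_below)
```

A trajectory whose decay curve stays below the threshold at all times is consistent with exponential stability at rate $\kappa$; a curve that crosses it indicates that the inequality was violated somewhere along that trajectory.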
