Time-optimal neural feedback control of nilpotent systems as a binary classification problem

Sara Bicego, Samuel Gue, Dante Kalise, Nelly Villamizar

TL;DR

This work addresses time-optimal control of linear nilpotent systems with scalar bang-bang inputs through two intertwined strategies. First, the switching-time polynomial system is solved with a deflated Newton method, augmented with root counting based on the Hermite quadratic form to bound the number of real roots, which enables real-time computation of switching sequences. Second, a time-optimal feedback map is learned as a binary classifier from synthetic data generated from optimal trajectories, with a confidence-aware augmentation that invokes the open-loop solver when needed. The authors demonstrate scalability up to 5th-order integrators, achieving classification accuracy of up to 99.61% and showing robustness to noise, while reducing reliance on computationally expensive algebraic tools such as Gröbner bases. The combination of algebraic root-finding bounds and data-driven feedback yields a practical framework for real-time time-optimal control of nilpotent systems, with clear pathways for extension to other time-optimal or sparse control problems. Overall, the paper provides a scalable pipeline that integrates deflation-based root search, Hermite-based root counting, and neural classifiers to synthesize robust, real-time bang-bang feedback laws.

Abstract

A computational method for the synthesis of time-optimal feedback control laws for linear nilpotent systems is proposed. The method is based on the use of the bang-bang theorem, which leads to a characterization of the time-optimal trajectory as a parameter-dependent polynomial system for the control switching sequence. A deflated Newton's method is then applied to exhaust all the real roots of the polynomial system. The root-finding procedure is informed by the Hermite quadratic form, which provides a sharp estimate on the number of real roots to be found. In the second part of the paper, the polynomial systems are sampled and solved to generate a synthetic dataset for the construction of a time-optimal deep neural network -- interpreted as a binary classifier -- via supervised learning. Numerical tests in integrators of increasing dimension assess the accuracy, robustness, and real-time-control capabilities of the approximate control law.
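
The interplay between root counting and deflation described above can be illustrated on a single univariate polynomial. The sketch below is a simplified stand-in, not the authors' implementation: the paper works with parameter-dependent multivariate systems, whereas here the Hermite form is built from the power sums of one polynomial, and its signature (the number of distinct real roots) is used as a stopping criterion for a deflated Newton iteration. The function names, the deflation operator $1/(x-r)^2 + 1$, the tolerances, and the initial guesses are all assumptions made for illustration.

```python
import numpy as np

def count_real_roots(coeffs, tol=1e-9):
    """Distinct real roots of a univariate polynomial via the Hermite form signature.

    coeffs are ordered from the highest degree down (as in np.poly1d).
    """
    c = np.asarray(coeffs, dtype=float)
    c = c / c[0]                      # make the polynomial monic
    d = len(c) - 1
    # Companion matrix: its eigenvalues are the roots of the polynomial.
    C = np.zeros((d, d))
    C[1:, :-1] = np.eye(d - 1)
    C[:, -1] = -c[:0:-1]
    # Power sums p_k = trace(C^k) fill the Hankel-structured Hermite matrix.
    p = [np.trace(np.linalg.matrix_power(C, k)) for k in range(2 * d - 1)]
    H = np.array([[p[i + j] for j in range(d)] for i in range(d)])
    ev = np.linalg.eigvalsh(H)
    scale = max(np.abs(ev).max(), 1.0)
    # Signature = (#positive eigenvalues) - (#negative eigenvalues).
    return int(np.sum(ev > tol * scale) - np.sum(ev < -tol * scale))

def deflated_real_roots(coeffs, guesses, tol=1e-10, max_iter=50):
    """Deflated Newton search that stops once the Hermite count is reached."""
    f = np.poly1d(coeffs)
    df = f.deriv()
    target = count_real_roots(coeffs)
    roots = []
    for x in guesses:
        if len(roots) == target:      # all real roots found: stop searching
            break
        x = float(x)
        for _ in range(max_iter):
            gaps = x - np.array(roots)
            if abs(f(x)) < tol:
                if np.all(np.abs(gaps) > 1e-6):
                    roots.append(x)
                break
            if np.any(gaps == 0.0):   # sitting exactly on a known root
                break
            # Logarithmic derivative of the assumed deflation operator
            # prod_i (1/(x - r_i)^2 + 1), which repels known roots.
            log_dm = np.sum(-2.0 / (gaps + gaps**3))
            denom = df(x) + f(x) * log_dm
            if denom == 0.0:
                break
            x -= f(x) / denom         # Newton step on the deflated function
    return sorted(roots), target

# Usage: x^3 - x has three real roots, so the search stops after finding three.
roots, n_real = deflated_real_roots([1, 0, -1, 0], guesses=[-2.0, 0.1, 2.0])
```

Knowing the number of real roots in advance is what lets the loop terminate with certainty; without it, a deflation-based search has no way to tell "all roots found" from "more guesses needed".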

Paper Structure

This paper contains 14 sections, 3 theorems, 39 equations, 7 figures, 1 algorithm.

Key Result

Theorem 2.1

Assume that Kalman's controllability condition holds for the pair $(A,\bm{b})$, and that all the eigenvalues of $A$ are real. Then, for every initial condition $\bm{x}$, there exists a unique time-optimal control signal $u$. This control signal is bang-bang, i.e., it takes values in $\{-1,1\}$, with at

Figures (7)

  • Figure 1: Graph of the two-dimensional polynomial system from Example \ref{exampleH}. The number of real solutions of the system depends on the sign of the diagonal elements in the matrix $\mathcal{H}(J)$. We have three cases: $x_2^2-2x_1>0$ (left), $x_2^2-2x_1=0$ (middle), and $x_2^2-2x_1<0$ (right).
  • Figure 2: The trained classifier identifies the switching surface as a low-confidence region.
  • Figure 3: Comparison of the optimal trajectory computed via deflation with the one controlled with the classifier $u_\theta$. On the left, the NN feedback is solely determined via $u_\theta$, whilst on the right we rely on the identification of the solution of the polynomial system \ref{eq:2dtest} to identify the feedback at low-confidence points. $T$ denotes the exact minimum time.
  • Figure 4: Comparison of the optimal trajectory with the $u_\theta$-controlled (left) and with the confidence-enhanced approximation (right). Relying on the polynomial solver for low-confidence points improves the approximation, as the time $T_{NN}$ needed to reach the origin decreases.
  • Figure 5: Mean and variance of a Monte Carlo simulation with $1000$ trajectories perturbed by Gaussian noise. The robustness of the approximated feedback law ensures that the controlled system reaches the target destination, while the open-loop solution, obtained as the solution of the polynomial system \ref{eq:polynomialSystemB0}, diverges from it.
  • ...and 2 more figures
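
The confidence-enhanced feedback referred to in Figures 2 to 4 can be sketched for the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$, $|u|\le 1$. Everything in the block below is an illustrative assumption: `surrogate_probability` is a hypothetical stand-in for the trained classifier $u_\theta$, the textbook closed-form feedback `exact_feedback` stands in for the open-loop polynomial solver used in the paper, and the `margin` and `sharpness` values are arbitrary. Only the gating logic (fall back to the solver when the classifier is unsure) reflects what the captions describe.

```python
import numpy as np

def exact_feedback(x):
    """Textbook time-optimal feedback for the double integrator.

    Stand-in for the open-loop polynomial solver (assumption for this sketch).
    """
    s = x[0] + 0.5 * x[1] * abs(x[1])   # switching function; s = 0 is the switching curve
    if s != 0.0:
        return float(-np.sign(s))
    return float(-np.sign(x[1]))        # on the curve, brake toward the origin

def surrogate_probability(x, sharpness=20.0):
    """Hypothetical classifier output P(u = +1 | x); not a trained network."""
    s = x[0] + 0.5 * x[1] * abs(x[1])
    return 1.0 / (1.0 + np.exp(sharpness * s))

def gated_feedback(x, prob=surrogate_probability, solver=exact_feedback, margin=0.4):
    """Return (u, used_solver): trust the classifier unless it is near 0.5."""
    p = prob(x)
    if abs(p - 0.5) < margin:           # low confidence: close to the switching surface
        return solver(x), True
    return (1.0 if p > 0.5 else -1.0), False

# Usage: far from the switching curve the classifier decides on its own,
# close to it the solver is invoked.
u_far, solver_far = gated_feedback(np.array([1.0, 0.0]))
u_near, solver_near = gated_feedback(np.array([0.01, 0.0]))
```

The design trade-off is between speed and accuracy: a larger `margin` calls the solver more often and tracks the optimal trajectory more closely, while a smaller one relies almost entirely on the cheap classifier evaluation.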

Theorems & Definitions (9)

  • Theorem 2.1: Chapter III, PMP
  • Proposition 2.2
  • proof
  • Example 2.3
  • Definition 4.1: Hermite quadratic form
  • Example 4.2
  • Proposition 4.3
  • Remark 4.4
  • Example 4.5