Linear Supervision for Nonlinear, High-Dimensional Neural Control and Differential Games

William Sharpless, Zeyuan Feng, Somil Bansal, Sylvia Herbert

TL;DR

This work tackles the curse of dimensionality in differential games by combining solutions of a linear Hamilton-Jacobi (HJ) PDE with deep learning to learn high-dimensional value functions $V$. It introduces two semi-supervision programs, both designed to leverage a linear supervisor $V_\ell$ during training: decayed linear semi-supervision (LSS-D), and an augmented nonlinear spectrum with a parameterized value $V_\lambda$. The Hopf-based linear solution provides fast, global supervision, while the augmented approach enables a smooth interpolation from the linear solution to the true nonlinear one. Both programs improve speed and accuracy on a 50-D differential game and a 10-D quadrotor collision-avoidance task: the results show substantial gains in intersection-over-union (IOU) and mean squared error (MSE) on the high-dimensional benchmark and improved safety-volume metrics in the quadrotor scenario, pointing to a practical pathway toward scalable, safer autonomous decision-making in complex multi-agent settings.

Abstract

As the dimension of a system increases, traditional methods for control and differential games rapidly become intractable, making the design of safe autonomous agents challenging in complex or team settings. Deep-learning approaches avoid discretization and have yielded numerous successes in robotics and autonomy, but in sufficiently high dimensions their accuracy falls as sampling becomes less efficient. We propose using rapidly generated linear solutions to the partial differential equation (PDE) arising in the problem to accelerate and improve learned value functions for guidance in high-dimensional, nonlinear problems. We define two programs that combine supervision by the linear solution with a standard PDE loss. We demonstrate that these programs offer improvements in speed and accuracy in both a 50-D differential game problem and a 10-D quadrotor control problem.
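To make the idea of combining linear supervision with a PDE loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes an exponentially decaying supervision weight and mean-square penalties; the names `decay_weight`, `semi_supervised_loss`, and the `rate` parameter are hypothetical, and the paper's actual decay schedule and loss terms may differ.

```python
import math


def decay_weight(step: int, total_steps: int, rate: float = 5.0) -> float:
    """Hypothetical schedule: weight is 1 at step 0 and decays exponentially."""
    return math.exp(-rate * step / total_steps)


def mean_square(values) -> float:
    """Mean of squared entries."""
    return sum(v * v for v in values) / len(values)


def semi_supervised_loss(pde_residuals, v_pred, v_lin, step, total_steps) -> float:
    """PDE loss plus a decaying penalty on the gap to the linear solution.

    pde_residuals: PDE residuals of the learned value at sampled points
    v_pred:        learned values at the same points
    v_lin:         linear-supervisor values (V_ell) at the same points
    """
    supervision = mean_square([p - l for p, l in zip(v_pred, v_lin)])
    pde = mean_square(pde_residuals)
    return pde + decay_weight(step, total_steps) * supervision


# Toy numbers, chosen only for illustration.
residuals = [0.1, -0.1]
v_pred = [1.0, 2.0]
v_lin = [0.0, 2.0]
early = semi_supervised_loss(residuals, v_pred, v_lin, step=0, total_steps=100)
late = semi_supervised_loss(residuals, v_pred, v_lin, step=100, total_steps=100)
```

Under this assumed schedule, the linear supervisor dominates early in training and the PDE residual dominates late, so `late` is smaller than `early` for the same predictions.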

Paper Structure

This paper contains 25 sections, 6 theorems, 52 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{S}_c(t)$ be the $c$-level set of $V$ at $t$ and $\bar{\mathcal{S}}(\tau)$ be a set containing any $\mathsf{x}(s)$ s.t. $J(\mathsf{x}(s')) \le c$ for $s,s' \in [\tau, t_f]$. Let where $\mathcal{E}(\tau) \triangleq \{ \Vert \varepsilon \Vert \le \delta^*(\tau)\}$ and $\delta^*(\tau) = \max_\Sigma \Vert f - \ell \Vert$ is defined on $\Sigma \triangleq \bar{\mathcal{S}}(\tau) \times \ma
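As a rough illustration of the quantity $\delta^* = \max_\Sigma \Vert f - \ell \Vert$ in the statement above, the sketch below estimates it by sampling for a toy system. Everything here is an assumption made for illustration: the pendulum-like dynamics `f`, its linearization `ell` about the origin, the grid standing in for $\Sigma$, and the omission of any control or disturbance inputs. A maximum over samples only bounds the true maximum from below.

```python
import math


def f(x):
    """Toy nonlinear dynamics (pendulum-like); hypothetical, not from the paper."""
    return (x[1], -math.sin(x[0]))


def ell(x):
    """Linearization of the toy dynamics about the origin."""
    return (x[1], -x[0])


def sampled_delta(points) -> float:
    """Largest sampled linearization error ||f(x) - ell(x)|| over the points."""
    return max(math.dist(f(x), ell(x)) for x in points)


# Grid over [-1, 1] x [-1, 1], standing in for the set Sigma.
grid = [((i - 10) / 10, (j - 10) / 10) for i in range(21) for j in range(21)]
delta = sampled_delta(grid)
```

For this toy system the error is $|x_1 - \sin x_1|$, which grows with $|x_1|$, so the sampled maximum is attained on the edge of the grid.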

Figures (3)

  • Figure 1: Demonstration of Thm. \ref{thm:Vlam} [$V_\lambda$] and Cor. \ref{cor:linbdtay}. On top, the true value of $V_\lambda$ at $t=1$ for the problem posed in \ref{def:auggame} with $N=3$ is given along the range of $\lambda$. Note the smooth change from $\lambda=0$, where $V_\lambda = V_\ell$, to $\lambda=1$, where $V_\lambda = V$. In the bottom row, the error between $V_\lambda$ and $V_\ell$ is plotted as $\lambda$ increases. Note the gradual increase in error and the large regions of $V$ with low error.
  • Figure 2: 50-D Benchmark Result Comparison. On the left, a slice of the learned solution for four variations (columns) of \ref{pubsubNd}, where $(\alpha, \beta) \in \{(20, 0), (-20, 0), (-20, 20), (10, -10)\}$, is shown for each proposed method (rows), and the ground truth zero-level set is overlaid in black. On the right, the IOU, MSEs, and run time are given for each of the variations and methods.
  • Figure 3: 10-D Quadrotor Result Comparison. In the upper-left, the problem in which the drone is flying toward an obstacle is depicted with two trajectories demonstrating success and failure. On the right, slices of the sub-zero level set of the learned value that approximate the unsafe set are shown (gold), along with the 99.9%-confidence conformal expansion of the learned set (teal) and a sample of the roll-outs (blue if safe, else red). In the lower-left, a slice of the learned sets is shown before and after the conformal expansion.
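The IOU and MSE metrics reported in Figure 2 can be illustrated with a short sketch. It assumes that IOU compares the sub-zero level sets $\{V \le 0\}$ of the learned and ground-truth values at a shared set of sample points, and that MSE is taken over the same points; the function names are hypothetical and the paper's exact evaluation protocol may differ.

```python
def sub_zero_iou(v_learned, v_true) -> float:
    """Intersection-over-union of the sets {V <= 0} at shared sample points."""
    pairs = list(zip(v_learned, v_true))
    inter = sum(1 for a, b in pairs if a <= 0 and b <= 0)
    union = sum(1 for a, b in pairs if a <= 0 or b <= 0)
    # If neither set contains any sample, treat the sets as identical.
    return inter / union if union else 1.0


def mse(v_learned, v_true) -> float:
    """Mean squared error between learned and true values."""
    return sum((a - b) ** 2 for a, b in zip(v_learned, v_true)) / len(v_true)


# Toy values at four sample points, chosen only for illustration.
v_true = [-1.0, -0.5, 0.5, 1.0]
v_learned = [-0.8, 0.1, -0.2, 1.1]
iou = sub_zero_iou(v_learned, v_true)
err = mse(v_learned, v_true)
```

In this toy case only the first point lies in both sets while three points lie in at least one, so the IOU is one third. The example also shows why the two metrics are reported together: values can be close in the mean-square sense while still disagreeing on which side of zero a point falls.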

Theorems & Definitions (11)

  • Theorem 1
  • Corollary 1
  • Definition 1: Linear Supervision Loss
  • Definition 2: Linear Semi-Supervision Loss, Decayed
  • Definition 3
  • Theorem 2
  • Definition 4: Linear Semi-Supervision Loss, Nonlinear Spectrum
  • Remark 1
  • Lemma 1
  • Lemma 2
  • ...and 1 more