Think Deep and Fast: Learning Neural Nonlinear Opinion Dynamics from Inverse Dynamic Games for Split-Second Interactions

Haimin Hu, Jaime Fernández Fisac, Naomi Ehrich Leonard, Deepak Gopinath, Jonathan DeCastro, Guy Rosman

TL;DR

The paper tackles the challenge of rapid, safe decision-making in non-cooperative multi-agent settings where deadlocks can occur. It introduces Neural Nonlinear Opinion Dynamics (Neural NOD) learned from inverse dynamic games, enabling online tuning of game costs via opinion-state dynamics. The approach combines a neural parameterization that maps states to time-varying cost weights with an offline training pipeline that differentiates through a forward game solver, plus analytical conditions for indecision-breaking through a pitchfork bifurcation. Empirical results in autonomous racing show Neural NOD outperforms state-of-the-art data-driven inverse game baselines on synthetic and human data, achieving safer and more decisive overtakes and demonstrating practical potential for fast, interpretable, and robust multi-agent planning.
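The indecision-breaking pitchfork bifurcation mentioned above can be illustrated with a toy model. The sketch below is not the paper's Neural NOD: it assumes the simplest scalar form of nonlinear opinion dynamics for one agent and one pair of options, dz/dt = -d z + u tanh(a z) + b, where the names `damping` (d), `attention` (u), `gain` (a), and `bias` (b) are chosen here for illustration. In this toy model, the neutral opinion z = 0 is stable when u a < d and loses stability when u a > d, so a tiny initial preference grows into a firm commitment.

```python
import math


def simulate_opinion(attention, damping=1.0, gain=1.0, bias=0.0,
                     z0=1e-3, dt=0.01, steps=5000):
    """Forward-Euler integration of dz/dt = -d*z + u*tanh(a*z) + b.

    All parameter names are illustrative assumptions, not the paper's notation.
    Returns the opinion state z after `steps` integration steps.
    """
    z = z0
    for _ in range(steps):
        z += dt * (-damping * z + attention * math.tanh(gain * z) + bias)
    return z


# Below the bifurcation point (u*a < d): the opinion decays back to neutral.
low = simulate_opinion(attention=0.5)

# Above the bifurcation point (u*a > d): the opinion commits to one option,
# and the sign of the commitment follows the sign of the small initial nudge.
high_pos = simulate_opinion(attention=2.0, z0=1e-3)
high_neg = simulate_opinion(attention=2.0, z0=-1e-3)

print(f"low attention:  z = {low:.3e}")
print(f"high attention: z = {high_pos:.3f} (from +nudge), {high_neg:.3f} (from -nudge)")
```

In this sketch the attention parameter is a fixed constant. The paper's contribution, as summarized above, is to learn such parameters from demonstrations and adapt them to the changing state, which this toy example does not attempt.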

Abstract

Non-cooperative interactions commonly occur in multi-agent scenarios such as car racing, where an ego vehicle can choose to overtake the rival, or stay behind it until a safe overtaking "corridor" opens. While an expert human can do well at making such time-sensitive decisions, autonomous agents are incapable of rapidly reasoning about complex, potentially conflicting options, leading to suboptimal behaviors such as deadlocks. Recently, the nonlinear opinion dynamics (NOD) model has proven to exhibit fast opinion formation and avoidance of decision deadlocks. However, NOD modeling parameters are oftentimes assumed fixed, limiting their applicability in complex and dynamic environments. It remains an open challenge to determine such parameters automatically and adaptively, accounting for the ever-changing environment. In this work, we propose for the first time a learning-based and game-theoretic approach to synthesize a Neural NOD model from expert demonstrations, given as a dataset containing (possibly incomplete) state and action trajectories of interacting agents. We demonstrate Neural NOD's ability to make fast and deadlock-free decisions in a simulated autonomous racing example. We find that Neural NOD consistently outperforms the state-of-the-art data-driven inverse game baseline in terms of safety and overtaking performance.

Paper Structure (12 sections, 2 theorems, 9 equations, 5 figures, 4 tables)

Key Result

Lemma 1

If there exist $i \in \mathcal{I}_{a}$ and $\ell \in \mathcal{I}_{o^i}$ such that $\alpha^{i}_\ell + \sigma^{i}_\ell(\bar{\mathbf{J}}_0) > 0$, where the Jacobian matrix $\bar{\mathbf{J}}_0$ is defined as $\left.\mathbf{J}(\operatorname{col}(\{\bar{S}^{i}(z^{i};\eta^{i})\}_{i \in

Figures (5)

  • Figure 1: Rapid and resolute decision-making is essential for non-cooperative multi-agent interactions like car racing. Top: During the 2021 Formula 1 Italian Grand Prix, a race-ending collision occurred involving championship contenders Max Verstappen and Lewis Hamilton. Verstappen was deemed predominantly responsible: although the overtaking opportunity closed after Hamilton (orange triangle) led him into the corner, he could still have avoided the collision by slowing down or taking the emergency alternative route (green arrows). Instead, he failed to make a timely decision and continued along the racing line (red arrow), which ultimately made the collision inevitable. Middle: A similar scenario arises in simulated autonomous racing when the ego car (red) uses an indecisive policy, hesitating between overtaking the rival (silver) from the inside or outside of the corner (as seen in its planned future motions depicted in transparent snapshots), ultimately resulting in a collision. Bottom: The proposed Neural NOD model reasons about split-second strategic interactions between the agents, rendering safe and decisive overtaking maneuvers.
  • Figure 2: Computation graph of the inverse game for training a Neural NOD model, illustrated with the autonomous racing example.
  • Figure 3: Simulation snapshots and the time evolution of game cost weights when the ego vehicle uses the Neural NOD learned from the synthetic dataset. Planned future motions are displayed with transparency. The racing line is plotted in dashed black. The ego car made a timely decision to speed up and move to the outside, safely overtaking the rival.
  • Figure 4: Comparing a simulated gameplay trajectory against the ground truth. Top: Ground-truth trajectories of the ego and rival. Middle: Simulation snapshots when the ego vehicle uses the Neural NOD to race against a rival whose motion is replayed from the ground-truth data. Bottom: Time evolution of game cost weights tuned by the Neural NOD.
  • Figure 5: Ego trajectory and velocity profile of the full endurance race at the Thunderhill Raceway. Transparent footprints denote the planned motion of the ego (red) and rival (silver). The black arrow and the grid indicate the track direction and finish line, respectively.

Theorems & Definitions (7)

  • Example 1
  • Remark 1: features
  • Example 2
  • Lemma 1
  • Theorem 1: Guaranteed Indecision Breaking
  • proof
  • Remark 2: Information Privilege