Mixed Strategy Nash Equilibrium for Crowd Navigation

Max Muchen Sun, Francesca Baldini, Katie Hughes, Peter Trautman, Todd Murphey

TL;DR

Bayesian Recursive Nash Equilibrium (BRNE) is a real-time model prediction crowd navigation framework. The underlying algorithm is provably equivalent to solving a mixed strategy game for crowd navigation, and it is guaranteed to recover the global Nash equilibrium of that game.

Abstract

Robots navigating in crowded areas should negotiate free space with humans rather than fully controlling collision avoidance, as this can lead to freezing behavior. Game theory provides a framework for the robot to reason about potential cooperation from humans for collision avoidance during path planning. In particular, the mixed strategy Nash equilibrium captures the negotiation behavior under uncertainty, making it well suited for crowd navigation. However, computing the mixed strategy Nash equilibrium is often prohibitively expensive for real-time decision-making. In this paper, we propose an iterative Bayesian update scheme over probability distributions of trajectories. The algorithm simultaneously generates a stochastic plan for the robot and probabilistic predictions of other pedestrians' paths. We prove that the proposed algorithm is equivalent to solving a mixed strategy game for crowd navigation, and the algorithm guarantees the recovery of the global Nash equilibrium of the game. We name our algorithm Bayesian Recursive Nash Equilibrium (BRNE) and develop a real-time model prediction crowd navigation framework. Since BRNE is not solving a general-purpose mixed strategy Nash equilibrium but a tailored formula specifically for crowd navigation, it can compute the solution in real-time on a low-power embedded computer. We evaluate BRNE in both simulated environments and real-world pedestrian datasets. BRNE consistently outperforms non-learning and learning-based methods regarding safety and navigation efficiency. It also reaches human-level crowd navigation performance in the pedestrian dataset benchmark. Lastly, we demonstrate the practicality of our algorithm with real humans on an untethered quadruped robot with fully onboard perception and computation.

Paper Structure

This paper contains 24 sections, 6 theorems, 18 equations, 13 figures, 8 tables, and 4 algorithms.

Key Result

Theorem 1

The sequence of mixed strategies $\{(p_1^{[k]},\dots,p_N^{[k]})\}_k$ generated by Algorithm \ref{algo:multi_agent_update} converges to a limit point $(p_1^*,\dots,p_N^*)$, and this limit point is the global Nash equilibrium \eqref{eq:mixed_ne_def} of the mixed strategy game \eqref{eq:general_sum_player_definition}.
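
To make the idea of an iterated sequence of mixed strategies concrete, here is a minimal sketch of a sample-based reweighting loop for agents on a one-dimensional line. It is not the paper's algorithm: this summary does not state the update rule or the risk function, so the multiplicative form (prior weight times the exponential of the negative expected risk), the Gaussian-shaped `risk` function, and all names and parameters below are assumptions chosen for illustration only.

```python
import math


def risk(a, b, scale=0.5):
    """Hypothetical collision risk: 1 when positions coincide, decaying with distance."""
    return math.exp(-((a - b) ** 2) / (2.0 * scale ** 2))


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def iterate_weights(samples, nominal, n_iters=50, tol=1e-9):
    """Iteratively reweight each agent's samples against the other agents.

    samples[i] -- candidate 1-D positions for agent i
    nominal[i] -- matching prior (nominal) weights for agent i
    Assumed update: new weight is proportional to
    prior * exp(-expected risk under the other agents' current weights).
    """
    weights = [normalize(w) for w in nominal]
    for _ in range(n_iters):
        max_change = 0.0
        for i in range(len(samples)):
            updated = []
            for a, prior in zip(samples[i], nominal[i]):
                expected = sum(
                    w * risk(a, b)
                    for j in range(len(samples)) if j != i
                    for b, w in zip(samples[j], weights[j])
                )
                updated.append(prior * math.exp(-expected))
            updated = normalize(updated)
            change = max(abs(u - w) for u, w in zip(updated, weights[i]))
            max_change = max(max_change, change)
            weights[i] = updated
        if max_change < tol:  # stop once no weight moves noticeably
            break
    return weights


# Two agents that both nominally prefer the centre position 0.0.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
prior = [1.0, 2.0, 4.0, 2.0, 1.0]
result = iterate_weights([grid, grid], [prior, prior])
print([round(w, 3) for w in result[0]])
```

Each agent's weights remain a probability distribution over its samples after every pass, which is the sense in which the loop produces a sequence of mixed strategies. Whether such a loop converges, and to what, depends on the actual update and risk definitions in the paper; the sketch makes no claim about that.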

Figures (13)

  • Figure 1: Comparison of optimality criterion in different navigation frameworks. (a) In traditional robot navigation, the robot makes optimal decisions, such as minimizing the risk of collision, in a given and non-interactive environment; (b) Cooperative navigation finds optimal cooperative decisions for both the robot and the human. With the pure strategy Nash equilibrium model, the robot expects deterministic actions from humans, which is too assertive considering the uncertain nature of human behavior; (c) Our cooperative navigation framework uses mixed strategy Nash equilibrium as the optimality criterion, which finds probabilities of actions that represent the optimal cooperation strategies between the robot and human. This model maintains uncertainty during the interaction.
  • Figure 2: Examples of the iterative Bayesian update process. (a) Two-agent negotiation in one-dimensional space. (b) Two-agent hallway passing, where the mixed strategy is visualized as trajectory samples. (c) Four-agent crossing.
  • Figure 3: Illustration of specifying a nominal mixed strategy with a Gaussian process (GP). (a) We first specify a trajectory as the mean function of the GP. For the robot, it would be a trajectory toward the goal, generated by a meta-planner; (b) We then specify the covariance kernel parameters, either learned from datasets or hand-tuned, which give us the GP prior distribution; (c) The GP prior is insufficient as the nominal mixed strategy. It needs to be conditioned at specific time steps with user-specified marginal uncertainty. In the figure, we condition the GP prior on the first and last time step (the specified marginal uncertainty is shown as the red ellipse); the resulting distribution is the robot's nominal mixed strategy.
  • Figure 4: Illustration of the model predictive crowd navigation framework. (a) The robot first takes measurements of nearby pedestrians' positions and velocities; (b) The robot then generates the mean functions of the Gaussian processes for the pedestrians and for itself; (c) The robot specifies the Gaussian processes as the nominal mixed strategies for all agents and draws trajectory samples from them; (d) The weights of the trajectory samples are updated based on Algorithm \ref{algo:multi_agent_update} until convergence; (e) The mean of the robot's converged mixed strategy becomes the robot's planned trajectory.
  • Figure 5: Examples of the joint navigation strategies (8 agents) from different methods, in the multi-agent navigation experiments.
  • ...and 8 more figures
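
The Figure 3 caption describes conditioning a Gaussian process prior at the first and last time steps with a specified marginal uncertainty. The sketch below illustrates one way to read that step, treating each conditioned time step as a noisy observation of the mean function, which leaves the mean unchanged and shrinks the variance near the anchors. That reading, the squared-exponential kernel, and every name and parameter value here are assumptions for illustration; the summary does not specify the paper's kernel or conditioning procedure.

```python
import math


def sq_exp_kernel(t1, t2, variance=1.0, length_scale=2.0):
    """Squared-exponential covariance between two time steps (assumed kernel)."""
    return variance * math.exp(-((t1 - t2) ** 2) / (2.0 * length_scale ** 2))


def conditioned_marginal_variance(times, anchor_times, anchor_variance):
    """Posterior variance at each time after conditioning on two noisy anchors.

    times           -- time steps at which to evaluate the marginal variance
    anchor_times    -- (first, last) time steps to condition on
    anchor_variance -- assumed observation noise at the anchors
    """
    t_a, t_b = anchor_times
    # Entries of the 2x2 matrix K_cc + noise * I for the two anchors.
    a = sq_exp_kernel(t_a, t_a) + anchor_variance
    b = sq_exp_kernel(t_a, t_b)
    d = sq_exp_kernel(t_b, t_b) + anchor_variance
    det = a * d - b * b
    inv = ((d / det, -b / det), (-b / det, a / det))

    variances = []
    for t in times:
        k = (sq_exp_kernel(t, t_a), sq_exp_kernel(t, t_b))
        # Quadratic form k^T (K_cc + noise * I)^{-1} k.
        reduction = (
            k[0] * (inv[0][0] * k[0] + inv[0][1] * k[1])
            + k[1] * (inv[1][0] * k[0] + inv[1][1] * k[1])
        )
        variances.append(sq_exp_kernel(t, t) - reduction)
    return variances


# Eleven time steps, conditioned on the first and the last.
steps = list(range(11))
marginals = conditioned_marginal_variance(steps, (0, 10), anchor_variance=0.01)
print([round(v, 3) for v in marginals])
```

In this sketch the marginal variance is small at the two conditioned time steps and returns toward the prior variance in between, which matches the qualitative picture of a distribution pinned at its start and end.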

Theorems & Definitions (31)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Remark 1
  • Definition 6
  • Definition 7
  • Definition 8
  • Definition 9
  • ...and 21 more