Markov Potential Game with Final-time Reach-Avoid Objectives

Sarah H. Q. Li, Abraham P. Vinod

TL;DR

A Markov potential game with final-time reach-avoid objectives is formulated by integrating potential game theory with stochastic reach-avoid control, and an iterative best response scheme is proposed under which the multi-player value iteration converges to a pure Nash equilibrium.

Abstract

We formulate a Markov potential game with final-time reach-avoid objectives by integrating potential game theory with stochastic reach-avoid control. Our focus is on multi-player trajectory planning where players maximize the same multi-player reach-avoid objective: the probability of all participants reaching their designated target states by a specified time, while avoiding collisions with one another. Existing approaches require centralized computation of actions via a global policy, which may have prohibitively expensive communication costs. Instead, we focus on approximations of the global policy via local state feedback policies. First, we adapt the recursive single player reach-avoid value iteration to the multi-player framework with local policies, and show that the same recursion holds on the joint state space. To find each player's optimal local policy, the multi-player reach-avoid value function is projected from the joint state to the local state using the other players' occupancy measures. Then, we propose an iterative best response scheme for the multi-player value iteration to converge to a pure Nash equilibrium. We demonstrate the utility of our approach in finding collision-free policies for multi-player motion planning in simulation.
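
The single-player recursion that the abstract builds on can be illustrated with a minimal sketch. This is not the paper's multi-player algorithm: it assumes a finite state space, a standard final-time reach-avoid objective (stay in a safe set at every step and be in the target set at the final step), and a hypothetical four-state chain whose dynamics, the name `reach_avoid_value_iteration`, and the success probability `p = 0.9` are all chosen only for illustration.

```python
import numpy as np

def reach_avoid_value_iteration(P, safe, target, K):
    """Backward recursion for a single-player final-time reach-avoid problem.

    P      : (A, S, S) array, P[a, s, s2] = probability of moving s -> s2 under action a
    safe   : boolean (S,) mask of states that may be occupied at any time
    target : boolean (S,) mask of states that must be occupied at time K - 1
    K      : number of time steps (times 0, ..., K - 1)
    Returns V with V[t, s] = max probability of success from state s at time t,
    and a time-varying greedy policy of shape (K - 1, S).
    """
    A, S, _ = P.shape
    V = np.zeros((K, S))
    policy = np.zeros((K - 1, S), dtype=int)
    # Terminal condition: succeed only if the final state is safe and in the target.
    V[K - 1] = (safe & target).astype(float)
    for t in range(K - 2, -1, -1):
        Q = P @ V[t + 1]                 # shape (A, S): expected next-step value per action
        policy[t] = Q.argmax(axis=0)     # greedy action at each state
        V[t] = safe * Q.max(axis=0)      # unsafe states have zero success probability
    return V, policy

# Hypothetical example: a chain of 4 states with actions 0 = left, 1 = right.
# A move succeeds with probability p; otherwise the player stays in place.
S, p = 4, 0.9
P = np.zeros((2, S, S))
for a, step in enumerate((-1, +1)):
    for s in range(S):
        s2 = min(max(s + step, 0), S - 1)  # clip at the ends of the chain
        P[a, s, s2] += p
        P[a, s, s] += 1 - p

safe = np.array([False, True, True, True])      # state 0 must be avoided
target = np.array([False, False, False, True])  # state 3 must be reached at the final time

V, policy = reach_avoid_value_iteration(P, safe, target, K=3)
# From state 1, both "right" moves must succeed, so V[0, 1] is approximately 0.9 * 0.9.
```

In the multi-player setting described above, the same backward recursion would run on the joint state space, with each player's value projected onto its local state through the other players' occupancy measures; that step is not shown here.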

Paper Structure

This paper contains 12 sections, 3 theorems, 30 equations, 2 figures, 1 table, 3 algorithms.

Key Result

Lemma 1

Consider any real-valued function $G: \mathcal{S}^{NK} \mapsto \mathbb{R}$ that takes in a joint trajectory $\{\tau_i\}_{i\in\mathbb{N}}$, where each $\tau_i \in \mathcal{S}^{K}$ is a Markov process as described by $h(\pi_i)$. Then the expectation of $G$ with respect to $\{\pi_i\}_{i\in[N]}$ … where $\tau_i = (s_i(0),\ldots, s_i(K-1))$ for all $i \in [N]$.

Figures (2)

  • Figure 1: Reach-avoid metrics over different action stochasticity values (green to black corresponds to $p = 0.95$ to $p = 0.75$).
  • Figure 2: Computation time (seconds) and best response iteration (k) vs state sizes.

Theorems & Definitions (7)

  • Definition 1: Local feedback
  • Definition 2: Nash equilibrium
  • Lemma 1
  • proof
  • Proposition 1
  • Theorem 1
  • proof