A Scalable Game Theoretic Approach for Coordination of Multiple Dynamic Systems

Mostafa M. Shibl; Vijay Gupta

TL;DR

This work considers the case where the dynamics of the coupled system can be modeled as a Markov potential game, and shows that by limiting information flow to local neighborhoods, agents’ policies can still converge to near-optimal policies.

Abstract

Learning in games provides a powerful framework to design control policies for self-interested agents that may be coupled through their dynamics, costs, or constraints. We consider the case where the dynamics of the coupled system can be modeled as a Markov potential game. In this case, distributed learning by the agents ensures that their control policies converge to a Nash equilibrium of this game. However, typical learning algorithms such as natural policy gradient require knowledge of the entire global state and actions of all the other agents, and may not be scalable as the number of agents grows. We show that by limiting the information flow to a local neighborhood of agents in the natural policy gradient algorithm, we can converge to a neighborhood of optimal policies. If the game can be designed through decomposing a global cost function of interest to a designer into local costs for the agents such that their policies at equilibrium optimize the global cost, this approach can be of interest to team coordination problems as well. We illustrate our approach through a sensor coverage problem.
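The idea of restricting each agent's information to a local neighborhood can be made concrete with a small sketch. The abstract does not specify how neighborhoods are defined, so the code below assumes a hop-distance neighborhood on an interaction graph with radius `kappa`. The function name `k_hop_neighborhood`, the adjacency-list representation, and the five-agent line graph are all hypothetical illustrations, not the paper's construction.

```python
from collections import deque


def k_hop_neighborhood(adjacency, agent, kappa):
    """Return the set of agents within `kappa` hops of `agent`, itself included.

    Assumption: `adjacency` maps each agent to a list of its direct neighbors.
    """
    visited = {agent}
    frontier = deque([(agent, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == kappa:
            continue  # do not expand beyond the allowed radius
        for nbr in adjacency[node]:
            if nbr not in visited:
                visited.add(nbr)
                frontier.append((nbr, dist + 1))
    return visited


# Hypothetical line graph of five agents: 0 - 1 - 2 - 3 - 4
line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
neighborhood = k_hop_neighborhood(line_graph, agent=2, kappa=1)
```

Under this assumption, an agent running a localized update would read only the states and actions of the agents in `neighborhood` rather than the full global state; a larger `kappa` trades scalability for a closer approximation of the full-information update.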

Paper Structure (10 sections, 8 theorems, 21 equations, 6 figures)

Introduction
Model
Markov Potential Game
Independent Natural Policy Gradient Algorithm
Problem Considered
Proposed Algorithm
Illustrative Examples
Job Balancing Game Problem
Sensor Coverage Problem
Conclusion & Future Work

Key Result

Theorem 1

Consider a Markov potential game (MPG) in which all agents update their policies according to the independent natural policy gradient algorithm. For a sufficiently small step size $\eta$, independent natural policy gradient exhibits last-iterate (asymptotic) convergence to the optimal Nash equilibrium policy.
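As a rough illustration of the kind of update the theorem refers to, the sketch below runs a softmax multiplicative-weights form of natural policy gradient in a toy setting. Everything specific here is an assumption made for illustration: the game is a hypothetical single-state, two-agent, identical-interest game (so the shared reward acts as the potential), agents maximize a reward rather than minimize a cost, and the names `REWARD`, `npg_step`, and `run_independent_npg` do not come from the paper.

```python
import math

# Hypothetical two-agent identical-interest game. REWARD[a1][a2] is the
# common payoff, which also serves as the potential function.
REWARD = [[1.0, 0.0],
          [0.0, 2.0]]


def npg_step(policy, action_values, eta):
    """One softmax NPG step: new pi(a) is proportional to pi(a) * exp(eta * Q(a))."""
    weights = [p * math.exp(eta * q) for p, q in zip(policy, action_values)]
    total = sum(weights)
    return [w / total for w in weights]


def run_independent_npg(eta=0.5, iterations=200):
    """Both agents update simultaneously, each using only its own action values."""
    pi1, pi2 = [0.5, 0.5], [0.5, 0.5]
    for _ in range(iterations):
        # Each agent evaluates its own actions against the other's current policy.
        q1 = [sum(REWARD[a][b] * pi2[b] for b in range(2)) for a in range(2)]
        q2 = [sum(REWARD[a][b] * pi1[a] for a in range(2)) for b in range(2)]
        # Simultaneous update: both steps use the policies from the previous iteration.
        pi1, pi2 = npg_step(pi1, q1, eta), npg_step(pi2, q2, eta)
    return pi1, pi2


pi1, pi2 = run_independent_npg()
```

In this toy game both policies concentrate on the second action, which is the joint action with the highest shared reward. Using raw action values instead of advantages gives the same update, because the baseline cancels in the normalization.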

Figures (6)

Figure 1: Job Balancing Game Network Diagram
Figure 2: Convergence Results of Independent Natural Policy Gradient for Job Balancing Game Problem
Figure 3: Percentage relative error of $\epsilon$ based on $\kappa$ for Job Balancing Game Problem
Figure 4: Sensor Coverage Node Diagram
Figure 5: Convergence Results of Independent Natural Policy Gradient for Sensor Coverage Problem
...and 1 more figure

Theorems & Definitions (15)

Definition 1
Definition 2: Equilibrium and $\epsilon$-Equilibrium Joint Policies
Theorem 1: Theorem 1.1 in rf
Theorem 2
Lemma 3
proof
Lemma 4
proof
Lemma 5
proof
...and 5 more
