Mean-Field Games With Finitely Many Players: Independent Learning and Subjectivity

Bora Yongacoglu, Gürdal Arslan, Serdar Yüksel

TL;DR

The paper provides a decentralized learning algorithm for partially observed n-player mean-field games and shows that it drives play to subjective Q-equilibrium by adapting the recently developed theory of satisficing paths to allow for subjectivity.

Abstract

Independent learners are agents that employ single-agent algorithms in multi-agent systems, intentionally ignoring the effect of other strategic agents. This paper studies mean-field games from a decentralized learning perspective, with two primary objectives: (i) to identify structure that can guide algorithm design, and (ii) to understand the emergent behaviour in systems of independent learners. We study a new model of partially observed mean-field games with finitely many players, local action observability, and a general observation channel for partial observations of the global state. Specific observation channels considered include (a) global observability, (b) local and mean-field observability, (c) local and compressed mean-field observability, and (d) only local observability. We establish conditions under which the control problem of a given agent is equivalent to a fully observed MDP, as well as conditions under which the control problem is equivalent only to a POMDP. Building on the connection to MDPs, we prove the existence of perfect equilibrium among memoryless stationary policies under mean-field observability. Leveraging the connection to POMDPs, we prove convergence of learning iterates obtained by independent learning agents under any of the aforementioned observation channels. We interpret the limiting values as subjective value functions, which an agent believes to be relevant to its control problem. These subjective value functions are then used to propose subjective Q-equilibrium, a new solution concept for partially observed n-player mean-field games, whose existence is proved under mean-field or global observability. We provide a decentralized learning algorithm for partially observed n-player mean-field games, and we show that it drives play to subjective Q-equilibrium by adapting the recently developed theory of satisficing paths to allow for subjectivity.
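To make the notion of an independent learner concrete, the following is a minimal sketch of one agent running tabular Q-learning on its own observation, action, and cost stream, with no model of the other players. It is not the paper's algorithm: the class name `IndependentQLearner`, the parameters `alpha`, `beta`, and `epsilon`, and the cost-minimization convention are all assumptions made for illustration. The observation passed in stands for whatever the agent's observation channel delivers (for example, a local state alone, or a local state paired with a compressed mean-field summary).

```python
import random
from collections import defaultdict


class IndependentQLearner:
    """Single-agent tabular Q-learning (hypothetical illustration).

    The agent treats its own observation as if it were the state of a
    single-agent problem and ignores the other players entirely.
    Assumption: the agent minimizes discounted cost.
    """

    def __init__(self, n_actions, alpha=0.1, beta=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha      # step size (assumed constant here)
        self.beta = beta        # discount factor
        self.epsilon = epsilon  # probability of a uniformly random action
        self.rng = random.Random(seed)
        # Q-factors indexed by observation, one entry per local action.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, obs):
        """Epsilon-greedy action selection on the agent's own Q-factors."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[obs]
        return min(range(self.n_actions), key=values.__getitem__)

    def update(self, obs, action, cost, next_obs):
        """Standard Q-learning update using only locally available data."""
        target = cost + self.beta * min(self.q[next_obs])
        self.q[obs][action] += self.alpha * (target - self.q[obs][action])


# Usage: one update, then a greedy action from the same observation.
learner = IndependentQLearner(n_actions=2, alpha=0.5, beta=0.5, epsilon=0.0)
learner.update(obs="s", action=0, cost=1.0, next_obs="s")
greedy_action = learner.act("s")
```

Because the other players also adapt, the environment seen by this learner is generally non-stationary, so the limits of such iterates need not be true value functions. This is the gap that the abstract's subjective value functions are meant to describe.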

Paper Structure

This paper contains 46 sections, 29 theorems, 136 equations, 3 figures, 1 table, 3 algorithms.

Key Result

Lemma 3

For any initial measure $\nu \in \Delta (\textnormal{X})$ and any player $i \in \mathcal{N}$, the mapping $\bm{\pi} \mapsto J^i ( \bm{\pi} , \nu )$ is continuous on $\bm{\Pi}_{S}$.

Figures (3)

  • Figure 1: Local State Transition Probabilities for Example 1
  • Figure 2: Frequency of subjective $\epsilon$-equilibrium plotted against the exploration phase index, averaged over 250 trials.
  • Figure 3: Mean number of players who are subjectively $\epsilon$-best-responding plotted against the exploration phase index, averaged over 250 trials.

Theorems & Definitions (58)

  • Definition 1
  • Definition 2
  • Lemma 3
  • Definition 4
  • Definition 5
  • Lemma 6
  • Definition 7
  • Theorem 8
  • Lemma 9
  • Lemma 10
  • ...and 48 more