Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics

Natalia Koliou; George Vouros

TL;DR

This paper tackles the instability of Nash equilibria in dynamic multi-agent settings by turning dynamic games into empirical games over strategy profiles and applying the evolutionary framework $α$-Rank to identify joint policies that are stable in the long run. It defines styles of play as strategies, trains CNN-based policies to realize these styles via Deep Q-Learning, and builds an empirical payoff matrix through extensive simulations of a stochastic Graph Coloring Game. The $α$-Rank analysis yields a stationary distribution over strategy profiles, revealing a Markov-Conley chain (MCC) in which profiles such as $(WL, CA)$ dominate, while classical Nash equilibria may be non-dominant in the long-run dynamics. The response graph additionally gives a descriptive, interpretable account of why profiles rank as they do. The findings indicate that stability and performance in dynamic multi-agent contexts can be ranked and explained transparently, with implications for designing robust, independently trained co-players and for extending the approach to more complex, real-world scenarios.

Abstract

Game-theoretic solution concepts, such as the Nash equilibrium, have been key to finding stable joint actions in multi-player games. However, it has been shown that the dynamics of agents' interactions, even in simple two-player games with few strategies, are incapable of reaching Nash equilibria, exhibiting complex and unpredictable behavior. Instead, evolutionary approaches can describe the long-term persistence of strategies and filter out transient ones, accounting for the long-term dynamics of agents' interactions. Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs, in dynamic games. Towards this goal, and building on previous results, this paper proposes transforming dynamic games into their empirical forms by considering agents' strategies instead of agents' actions, and applying the evolutionary methodology $α$-Rank to evaluate and rank strategy profiles according to their long-term dynamics. This methodology not only allows us to identify joint strategies that are strong through agents' long-term interactions, but also provides a descriptive, transparent framework regarding the high ranking of these strategies. Experiments report on agents that aim to collaboratively solve a stochastic version of the graph coloring problem. We consider different styles of play as strategies to define the empirical game, and train policies realizing these strategies, using the DQN algorithm. Then we run simulations to generate the payoff matrix required by $α$-Rank to rank joint strategies.
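The ranking step described above can be illustrated with a minimal sketch, assuming the standard multi-population $α$-Rank transition model: a Markov chain over pure strategy profiles in which one player at a time switches strategy with a payoff-dependent fixation probability. The function names `fixation_prob` and `alpha_rank`, the values of `alpha` and `m`, and the toy payoff matrix are all hypothetical choices for illustration; they are not taken from the paper's implementation or its experiments.

```python
import itertools
import numpy as np


def fixation_prob(delta, alpha, m):
    """Probability that one mutant with payoff advantage `delta`
    takes over a population of size `m` (selection intensity `alpha`)."""
    if np.isclose(delta, 0.0):
        return 1.0 / m  # neutral drift
    # Equals (1 - exp(-alpha*delta)) / (1 - exp(-m*alpha*delta)).
    return np.expm1(-alpha * delta) / np.expm1(-m * alpha * delta)


def alpha_rank(payoffs, alpha=0.1, m=10):
    """Rank pure strategy profiles of an empirical game.

    payoffs: list with one array per player, each of shape
             (|S_1|, ..., |S_K|), holding that player's mean payoff.
    Returns the list of profiles and their stationary probabilities.
    """
    shape = payoffs[0].shape
    profiles = list(itertools.product(*(range(n) for n in shape)))
    index = {p: i for i, p in enumerate(profiles)}
    eta = 1.0 / sum(n - 1 for n in shape)  # uniform choice of deviation

    C = np.zeros((len(profiles), len(profiles)))
    for p in profiles:
        i = index[p]
        for k, n in enumerate(shape):
            for s in range(n):
                if s == p[k]:
                    continue
                q = p[:k] + (s,) + p[k + 1:]  # player k deviates to s
                delta = payoffs[k][q] - payoffs[k][p]
                C[i, index[q]] = eta * fixation_prob(delta, alpha, m)
        C[i, i] = 1.0 - C[i].sum()  # probability of staying put

    # Stationary distribution: left eigenvector of C for eigenvalue 1.
    evals, evecs = np.linalg.eig(C.T)
    v = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    return profiles, v / v.sum()


# Toy two-player common-payoff game (hypothetical numbers).
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
profiles, pi = alpha_rank([A, A])
for p, mass in sorted(zip(profiles, pi), key=lambda t: -t[1]):
    print(p, round(float(mass), 3))
```

In practice $α$-Rank is typically run with a large ranking intensity `alpha`; the small value here only keeps the toy chain numerically well conditioned. In the setting of this paper, the payoff arrays would instead hold the average returns obtained by simulating each pair of trained styles of play.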
