Learning in Time-Varying Monotone Network Games with Dynamic Populations

Feras Al Taha; Kiran Rokade; Francesca Parise

TL;DR

This work studies learning in multi-agent settings with time-varying networks and dynamic participation, modeling each stage's interactions through a random network with independent realizations. It shows that projected gradient dynamics converge almost surely and in mean square to the Nash equilibrium of the expected-cost game, characterized by the operator $\tilde{F}$, and that the learned profile is an $\epsilon$-Nash equilibrium of each stage game with high probability; non-asymptotic regret guarantees follow. The results extend earlier linear-quadratic analyses to the broader class of smooth, strongly monotone network games and give mean-square convergence rates under diminishing step sizes. They offer rigorous guidance for designing open, dynamic multi-agent systems (MAS) whose network topology and participation fluctuate over time.

Abstract

In this paper, we present a framework for multi-agent learning in a nonstationary dynamic network environment. More specifically, we examine projected gradient play in smooth monotone repeated network games in which the agents' participation and connectivity vary over time. We model this changing system with a stochastic network which takes a new independent realization at each repetition. We show that the strategy profile learned by the agents through projected gradient dynamics over the sequence of network realizations converges to a Nash equilibrium of the game in which players minimize their expected cost, almost surely and in the mean-square sense. We then show that the learned strategy profile is an almost Nash equilibrium of the game played by the agents at each stage of the repeated game with high probability. Using these two results, we derive non-asymptotic bounds on the regret incurred by the agents.
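
To make the setting concrete, the following is a minimal sketch of projected gradient play over a sequence of independent network realizations. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the scalar linear-quadratic cost, the edge model in which each undirected link appears independently with probability `p`, the interval strategy set `[0, hi]`, the step size `1/(k+1)`, and all function names. For this toy cost the expected game has a symmetric interior equilibrium in closed form, which gives a reference point for the learned profile.

```python
import random


def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def sample_network(n, p, rng):
    """Draw one undirected network realization; each edge is present w.p. p."""
    g = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                g[i][j] = g[j][i] = 1.0
    return g


def stage_gradient(i, s, g, a, b):
    """d/ds_i of the assumed cost J_i = 0.5*s_i^2 - s_i*(a + b*sum_j G_ij*s_j)."""
    return s[i] - a - b * sum(g[i][j] * s[j] for j in range(len(s)))


def projected_gradient_play(n=5, p=0.5, a=1.0, b=0.1, hi=5.0, steps=20000, seed=0):
    """Run projected gradient play against a fresh network at every stage."""
    rng = random.Random(seed)
    s = [0.0] * n
    for k in range(steps):
        g = sample_network(n, p, rng)  # new independent realization
        tau = 1.0 / (k + 1)            # diminishing step size (assumed schedule)
        # All agents update simultaneously from the previous profile s.
        s = [project(s[i] - tau * stage_gradient(i, s, g, a, b), 0.0, hi)
             for i in range(n)]
    return s


def expected_game_equilibrium(n=5, p=0.5, a=1.0, b=0.1):
    """Symmetric interior equilibrium of the expected game for the toy cost.

    Solves s - a - b*p*(n-1)*s = 0, valid when b*p*(n-1) < 1 and the
    solution lies inside the strategy interval.
    """
    return a / (1.0 - b * p * (n - 1))
```

Calling `projected_gradient_play()` and comparing each entry with `expected_game_equilibrium()` gives a quick check that the iterates settle near the equilibrium of the expected game, even though no single stage is played on the expected network. The parameters are chosen so that `b * (n - 1) < 1`, which keeps this toy game strongly monotone for every network realization.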

Paper Structure (16 sections, 14 theorems, 52 equations)

Introduction
Main contributions
Related works
Paper organization
Notation
Preliminaries
One shot game
Learning in static repeated games
Time-Varying Network Games
Random network model
Learning dynamics
Convergence and Regret Analysis
Convergence
Regret guarantees
Conclusion
...and 1 more section

Key Result

Lemma 3.1

Suppose that Assumption a:strat_set holds. For each $i \in \mathcal{N}$, the cost function $\widetilde{J}_i(s)$ is well-defined. Moreover, for all $i \in \mathcal{N}$ and $s \in \mathcal{S}$, and

Theorems & Definitions (27)

Definition 1: $\epsilon$-Nash equilibrium
Lemma 3.1
proof
Proposition 3.2
proof
Proposition 3.3
proof
Lemma 4.1: robbins1971convergence
Theorem 4.2
proof
...and 17 more
