Learning in Time-Varying Monotone Network Games with Dynamic Populations
Feras Al Taha, Kiran Rokade, Francesca Parise
TL;DR
This work studies learning in multi-agent systems whose network topology and agent participation vary over time, modeling the interaction at each stage as an independent realization of a random network. It shows that projected gradient dynamics converge, almost surely and in the mean-square sense, to the Nash equilibrium of the expected-cost game (characterized by the operator $\tilde{F}$), and that the learned profile is, with high probability, an $\epsilon$-Nash equilibrium of each stage game; non-asymptotic regret bounds follow from these two results. The analysis extends previous linear-quadratic results to the broader class of smooth, strongly monotone network games and gives convergence rates under diminishing step sizes. The results offer guidance for designing open multi-agent systems in which network topology and participation fluctuate over time.
Abstract
In this paper, we present a framework for multi-agent learning in a nonstationary dynamic network environment. More specifically, we examine projected gradient play in smooth monotone repeated network games in which the agents' participation and connectivity vary over time. We model this changing system with a stochastic network which takes a new independent realization at each repetition. We show that the strategy profile learned by the agents through projected gradient dynamics over the sequence of network realizations converges to a Nash equilibrium of the game in which players minimize their expected cost, almost surely and in the mean-square sense. We then show that the learned strategy profile is an almost Nash equilibrium of the game played by the agents at each stage of the repeated game with high probability. Using these two results, we derive non-asymptotic bounds on the regret incurred by the agents.
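As an illustration of the setting described in the abstract, the following minimal sketch simulates projected gradient play over independent network realizations. It is not the authors' code, and every modeling choice in it is an assumption made for the example: a linear-quadratic cost $J_i(x) = \tfrac{1}{2}x_i^2 - x_i\,(a_i + \beta \sum_j G_{ij} x_j)$ (one simple instance of a smooth, strongly monotone network game), links drawn independently with a common probability, a box constraint $[0, x_{\max}]$, and step size $1/(k+1)$. Only the links are randomized; agent participation is not modeled separately. The function and parameter names are hypothetical.

```python
import numpy as np


def simulate_projected_gradient_play(n_agents=20, n_stages=5000, beta=0.5,
                                     edge_prob=0.3, x_max=10.0, seed=0):
    """Projected gradient play on i.i.d. random network realizations.

    Assumed (illustrative) stage cost of agent i:
        J_i(x) = 0.5 * x_i**2 - x_i * (a_i + beta * sum_j G[i, j] * x_j)
    so the stage game operator is F(x, G) = x - a - beta * G @ x.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(1.0, 2.0, size=n_agents)  # assumed standalone benefits

    # Expected network: each off-diagonal link has weight 1/n_agents
    # and appears with probability edge_prob.
    off_diag = np.ones((n_agents, n_agents)) - np.eye(n_agents)
    G_bar = edge_prob * off_diag / n_agents

    # Nash equilibrium of the expected-cost game, assuming it is interior
    # to the box [0, x_max] (true for these defaults): (I - beta*G_bar) x = a.
    x_star = np.linalg.solve(np.eye(n_agents) - beta * G_bar, a)

    x = np.zeros(n_agents)
    errors = np.empty(n_stages)
    for k in range(n_stages):
        # Independent network realization for stage k.
        links = rng.random((n_agents, n_agents)) < edge_prob
        np.fill_diagonal(links, False)
        G_k = links / n_agents

        # Each agent's gradient of its own stage cost w.r.t. its own strategy.
        grad = x - a - beta * (G_k @ x)

        # Diminishing step size, then projection onto the box constraint.
        tau = 1.0 / (k + 1)
        x = np.clip(x - tau * grad, 0.0, x_max)

        errors[k] = np.linalg.norm(x - x_star)
    return x, x_star, errors


if __name__ == "__main__":
    x, x_star, errors = simulate_projected_gradient_play()
    print("distance to expected-game equilibrium after first stage:", errors[0])
    print("distance to expected-game equilibrium after last stage: ", errors[-1])
```

Under these assumptions, the distance between the learned profile and the equilibrium of the expected-cost game should shrink as the number of stages grows, even though no single stage is played on the expected network.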
