Neural Population Learning beyond Symmetric Zero-sum Games
Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Z. Leibo, Nicolas Heess
TL;DR
This work introduces NeuPL-JPSRO, a scalable neural population learning algorithm that converges to a normal-form Coarse Correlated Equilibrium (CCE) in n-player general-sum games by combining strategy embeddings, distillation, and regularisation with best-response learning. It extends JPSRO by sharing representations across policies, which enables skill transfer and online adaptation, and it uses a payoff-estimator network to evaluate metagame payoffs efficiently. Empirical results in OpenSpiel, MuJoCo cheetah-run, and multi-agent capture-the-flag show convergence toward a CCE and the emergence of coordinated, transferable skills, even under partial observability. The approach offers a practical path towards solving real-world interactions between heterogeneous agents with mixed motives by combining game-theoretic convergence guarantees with deep representation learning and skill transfer.
Abstract
We study computationally efficient methods for finding equilibria in n-player general-sum games, specifically ones that afford complex visuomotor skills. We show how existing methods would struggle in this setting, either computationally or in theory. We then introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game. We show empirical convergence in a suite of OpenSpiel games, validated rigorously by exact game solvers. We then deploy NeuPL-JPSRO to complex domains, where our approach enables adaptive coordination in a MuJoCo control domain and skill transfer in capture-the-flag. Our work shows that equilibrium-convergent population learning can be implemented at scale and in generality, paving the way towards solving real-world games between heterogeneous players with mixed motives.
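To make the convergence target concrete, the sketch below checks the standard CCE condition on a small normal-form game: a joint distribution over action profiles is a CCE if no player can improve their expected payoff by committing to a single fixed action before the joint action is drawn. This is only an illustration of the definition, not the NeuPL-JPSRO algorithm or the exact solvers used in the paper. The function name `cce_gap`, the dictionary-based game encoding, and the game of Chicken example are all assumptions introduced here.

```python
def cce_gap(payoffs, sigma, num_actions):
    """Largest gain any player gets from a fixed unilateral deviation.

    payoffs: dict mapping a joint action (tuple of ints) to a tuple of
             per-player payoffs (hypothetical encoding for this sketch).
    sigma:   dict mapping joint actions to probabilities (joint distribution).
    num_actions: list with the number of actions of each player.

    sigma is a CCE exactly when the returned value is <= 0.
    """
    worst = float("-inf")
    for player in range(len(num_actions)):
        # Expected payoff when the player follows the joint distribution.
        on_path = sum(p * payoffs[a][player] for a, p in sigma.items())
        for dev in range(num_actions[player]):
            # Expected payoff when the player always plays `dev` while the
            # other players' actions are still drawn from sigma.
            dev_value = sum(
                p * payoffs[a[:player] + (dev,) + a[player + 1:]][player]
                for a, p in sigma.items()
            )
            worst = max(worst, dev_value - on_path)
    return worst


# Game of Chicken: action 0 = dare, action 1 = chicken.
CHICKEN = {
    (0, 0): (0, 0),
    (0, 1): (7, 2),
    (1, 0): (2, 7),
    (1, 1): (6, 6),
}

# A well-known correlated equilibrium: uniform over the three profiles
# that avoid the mutual-dare outcome.
correlated = {(0, 1): 1 / 3, (1, 0): 1 / 3, (1, 1): 1 / 3}
# Always playing mutual dare is not an equilibrium.
mutual_dare = {(0, 0): 1.0}

gap_correlated = cce_gap(CHICKEN, correlated, [2, 2])    # negative: a CCE
gap_mutual_dare = cce_gap(CHICKEN, mutual_dare, [2, 2])  # positive: not a CCE
```

A non-positive gap means no fixed deviation is profitable for any player. An empirical convergence study of the kind described above would track a quantity like this gap shrinking towards zero, although the specific metric used in the paper is not stated in this section.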
