Approximating Nash Equilibria in Normal-Form Games via Stochastic Optimization
Ian Gemp, Luke Marris, Georgios Piliouras
TL;DR
This work addresses the computational cost of finding approximate Nash equilibria (NE) in $n$-player, general-sum normal-form games by recasting NE computation as a stochastic non-convex optimization problem. The authors introduce a loss $\mathcal{L}^{\tau}(\boldsymbol{x})$ that admits unbiased Monte Carlo estimation and is Lipschitz and bounded, so it can be minimized with SGD and bandit-based methods. They relate the loss to exploitability, extend it to entropy-regularized (quantal response) equilibria, and derive convergence guarantees via X-armed bandits and StoSOO under the condition of polymatrix-isolated equilibria. In experiments, SGD outperforms prior baselines in some settings. The results suggest a scalable route to approximate equilibria in large multi-agent systems, with extensive-form games and broader optimization techniques left as future work.
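To make the notion of exploitability concrete, here is a minimal sketch for the two-player case. It uses one common definition, the largest gain any player can obtain by deviating unilaterally to a best response. It does not reproduce the loss $\mathcal{L}^{\tau}(\boldsymbol{x})$ itself, and the function names and the matching-pennies example are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def row_action_values(A: Matrix, y: Sequence[float]) -> List[float]:
    """Expected payoff of each row action against the column mix y."""
    return [sum(a * p for a, p in zip(row, y)) for row in A]


def col_action_values(B: Matrix, x: Sequence[float]) -> List[float]:
    """Expected payoff of each column action against the row mix x."""
    return [sum(B[i][j] * x[i] for i in range(len(B))) for j in range(len(B[0]))]


def exploitability(A: Matrix, B: Matrix, x: Sequence[float], y: Sequence[float]) -> float:
    """Largest unilateral best-response gain over both players.

    A and B are the row and column player's payoff matrices (assumed
    layout: rows index the row player's actions). The value is zero
    exactly when (x, y) is a Nash equilibrium.
    """
    row_vals = row_action_values(A, y)
    col_vals = col_action_values(B, x)
    u_row = sum(p * v for p, v in zip(x, row_vals))  # row player's current payoff
    u_col = sum(p * v for p, v in zip(y, col_vals))  # column player's current payoff
    return max(max(row_vals) - u_row, max(col_vals) - u_col)


# Matching pennies (illustrative): the uniform profile is the unique equilibrium.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
uniform_gap = exploitability(A, B, [0.5, 0.5], [0.5, 0.5])
pure_gap = exploitability(A, B, [1.0, 0.0], [1.0, 0.0])
```

In this example the uniform profile has zero exploitability, while the pure profile leaves the column player a gain of 2 from switching actions. Evaluating the best response exactly, as done here, requires full expected payoffs; in $n$-player games these are sums over exponentially many joint actions, which is why a loss that can be estimated from samples is attractive.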
Abstract
We propose the first loss function for approximate Nash equilibria of normal-form games that is amenable to unbiased Monte Carlo estimation. This construction allows us to deploy standard non-convex stochastic optimization techniques for approximating Nash equilibria, resulting in novel algorithms with provable guarantees. We complement our theoretical analysis with experiments demonstrating that stochastic gradient descent can outperform previous state-of-the-art approaches.
