Geometric Structure and Polynomial-time Algorithm of Game Equilibria
Hongbo Sun, Chongkun Xia, Junbo Tan, Bo Yuan, Xueqian Wang, Bin Liang
TL;DR
The paper reframes game equilibrium computation as an optimization problem that splits into two subproblems, one over the policy and one over the value function, and introduces unbiased KKT conditions and the equilibrium bundle to capture all perfect equilibria of dynamic games. A primal-dual unbiased interior-point method is recast as a line search on the equilibrium bundle and combined with dynamic programming in the policy cone to obtain a convergent, polynomial-time scheme. This hybrid approach yields an FPTAS for weak ε-approximations of perfect equilibria, which implies PPAD=FP, and is supported by existence and oddness theorems and by experimental validation on thousands of dynamic games. The framework offers a scalable, model-agnostic pathway to robust multi-agent planning and learning, with the potential to mitigate non-stationarity and the curse of multiagency in MARL, and it connects directly to computational complexity theory.
Abstract
Whether a PTAS (polynomial-time approximation scheme) exists for game equilibria has been an open question, and the lack of one has implications in three fields: the practicality of methods in algorithmic game theory, non-stationarity and the curse of multiagency in MARL (multi-agent reinforcement learning), and the tractability of PPAD in computational complexity theory. In this paper, we formalize the game equilibrium problem as an optimization problem that splits into two subproblems, one with respect to the policy and one with respect to the value function, which are solved by an interior-point method and by dynamic programming, respectively. Combining these two parts, we obtain an FPTAS (fully polynomial-time approximation scheme) for the weak approximation (approximating to an $ε$-equilibrium) of any perfect equilibrium of any dynamic game, implying PPAD=FP since the weak approximation problem is PPAD-complete. In addition, we introduce a geometric object called the equilibrium bundle, which serves three purposes. First, perfect equilibria of dynamic games are formalized as zero points of its canonical section. Second, the hybrid iteration of dynamic programming and the interior-point method is formalized as a line search on it. Third, it yields existence and oddness theorems that extend those for Nash equilibria. In experiments, the line search process is animated, and the method is tested on 2000 randomly generated dynamic games, converging to a perfect equilibrium in every case.
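To make the notion of weak approximation concrete, the following minimal sketch checks whether a strategy profile is an $ε$-equilibrium in the simplest setting: a two-player static (single-state) game given by payoff matrices. It is an illustration only and not the paper's algorithm. The function names `expected_payoff`, `exploitability`, and `is_epsilon_equilibrium`, and the restriction to bimatrix games, are assumptions made here; the paper itself treats general dynamic games, where the check would involve value functions at every state.

```python
from typing import List

Matrix = List[List[float]]


def expected_payoff(payoff: Matrix, x: List[float], y: List[float]) -> float:
    """Expected payoff when the row player mixes with x and the column player with y."""
    return sum(
        x[i] * payoff[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


def exploitability(A: Matrix, B: Matrix, x: List[float], y: List[float]) -> float:
    """Largest gain any single player can obtain by deviating unilaterally.

    A is the row player's payoff matrix, B the column player's (hypothetical
    names for this sketch). A best response can always be taken to be pure,
    so it suffices to scan pure deviations.
    """
    best_row = max(
        sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))
    )
    best_col = max(
        sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y))
    )
    gain_row = best_row - expected_payoff(A, x, y)
    gain_col = best_col - expected_payoff(B, x, y)
    return max(gain_row, gain_col)


def is_epsilon_equilibrium(
    A: Matrix, B: Matrix, x: List[float], y: List[float], eps: float
) -> bool:
    """True if no player can improve by more than eps by deviating alone."""
    return exploitability(A, B, x, y) <= eps


# Matching pennies: the uniform profile is an exact equilibrium,
# while a pure profile leaves the column player a large incentive to deviate.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
uniform = [0.5, 0.5]
pure = [1.0, 0.0]
print(is_epsilon_equilibrium(A, B, uniform, uniform, eps=1e-9))
print(exploitability(A, B, pure, pure))
```

In this sketch the quantity returned by `exploitability` is the smallest $ε$ for which the profile qualifies as an $ε$-equilibrium, so an approximation scheme in the weak sense would aim to drive it below a prescribed tolerance.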
