Asymmetric Perturbation in Solving Bilinear Saddle-Point Optimization
Kenshi Abe, Mitsuki Sakamoto, Kaito Ariu, Atsushi Iwasaki
TL;DR
The paper addresses fast, exact equilibrium computation in bilinear saddle-point problems by introducing an asymmetric payoff perturbation that perturbs only the $x$-player's payoff. For sufficiently small perturbations, this yields an equilibrium invariance property: the equilibrium strategy of the perturbed game is also an equilibrium strategy of the original game. The property enables a gradient-based method, AsymP-GDA, with a linear last-iterate convergence rate to an equilibrium of the original game, as well as a parameter-free variant that retains the linear rate. The approach is extended to extensive-form games through the sequence-form representation and a dilated Euclidean regularizer, yielding AsymP-DGDA, which shows strong empirical performance across multiple games. Together, the results offer a practical, provably fast, and tuning-free route to stable equilibrium computation in both normal-form and extensive-form zero-sum settings, with potential extensions to broader game-theoretic and Markov-game contexts.
Abstract
This paper proposes an asymmetric perturbation technique for solving bilinear saddle-point optimization problems, which commonly arise in minimax problems, game theory, and constrained optimization. Perturbing payoffs or values is known to be effective in stabilizing learning dynamics and equilibrium computation. However, this approach requires decreasing perturbation magnitudes to ensure convergence to an equilibrium of the underlying game, which results in a slower convergence rate. To overcome this, we introduce an asymmetric perturbation approach, where only one player's payoff function is perturbed. Exploiting the near-linear structure of bilinear problems, we show that, for a sufficiently small perturbation, the equilibrium strategy of the asymmetrically perturbed game coincides with an equilibrium strategy of the original game. Building on this property, we develop a perturbation-based learning algorithm with a linear last-iterate convergence rate to an equilibrium strategy of the original game, and we further show how to construct a parameter-free procedure that retains a linear rate. Finally, we empirically demonstrate fast convergence toward equilibria in both normal-form and extensive-form games.
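The invariance claim can be checked numerically on a toy game. The sketch below is an illustration only, not the paper's algorithm: it assumes a squared Euclidean perturbation $(\mu/2)\|x\|^2$ (the abstract does not specify the form), a hypothetical $2 \times 2$ payoff matrix `A`, an arbitrary strength `mu`, and a brute-force grid search in place of any gradient method. It compares the $x$-player's minimax strategy in the original game, in the game where only the $x$-player is perturbed, and in the game where both players are perturbed.

```python
import numpy as np

# Hypothetical 2x2 payoff matrix (not from the paper):
# the x-player minimizes x^T A y, the y-player maximizes it.
A = np.array([[2.0, -1.0],
              [-1.0, 1.0]])
mu = 0.5  # illustrative perturbation strength (assumption)

# Parameterize both strategies on a grid: x = (p, 1 - p), y = (q, 1 - q).
grid = np.linspace(0.0, 1.0, 1001)
X = np.stack([grid, 1.0 - grid], axis=1)
Y = X.copy()

payoff = X @ A @ Y.T                       # payoff[i, j] = x_i^T A y_j
reg_x = 0.5 * mu * np.sum(X**2, axis=1)    # (mu/2) * ||x||^2, assumed form
reg_y = 0.5 * mu * np.sum(Y**2, axis=1)    # (mu/2) * ||y||^2, assumed form

def x_minimax(values):
    """Return p for the grid strategy x that minimizes the worst case over y."""
    worst_case = values.max(axis=1)        # y-player's best response per x
    return grid[np.argmin(worst_case)]

# Original game: min_x max_y x^T A y
p_original = x_minimax(payoff)
# Asymmetric perturbation: only the x-player's payoff is regularized
p_asymmetric = x_minimax(payoff + reg_x[:, None])
# Symmetric perturbation: both players are regularized
p_symmetric = x_minimax(payoff + reg_x[:, None] - reg_y[None, :])

print(p_original, p_asymmetric, p_symmetric)
```

In this toy game, the original and asymmetrically perturbed problems give the same strategy, $p = 0.4$, while the symmetric perturbation shifts it to roughly $0.385$. This reflects the piecewise-linear shape of $\max_y x^\top A y$: a small strongly convex term added for $x$ alone cannot move the minimizer off a kink, whereas perturbing $y$ as well smooths the kink away.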
