Asymmetric Perturbation in Solving Bilinear Saddle-Point Optimization

Kenshi Abe, Mitsuki Sakamoto, Kaito Ariu, Atsushi Iwasaki

TL;DR

The paper addresses the challenge of fast, exact equilibrium computation in bilinear saddle-point problems by introducing an asymmetric payoff perturbation that perturbs only the $x$-player. This yields an equilibrium invariance property for sufficiently small perturbations and enables a gradient-based method, AsymP-GDA, with a linear last-iterate convergence rate to the original-game equilibrium; it also provides a parameter-free variant that retains the linear rate. The approach is extended to extensive-form games through sequence-form representation and a dilated Euclidean regularizer, yielding AsymP-DGDA with strong empirical performance across multiple games. Collectively, the results offer a practical, provably fast, and tuning-free pathway to stable equilibrium computation in both normal-form and extensive-form zero-sum settings, with potential extensions to broader game-theoretic and Markov-game contexts.

Abstract

This paper proposes an asymmetric perturbation technique for solving bilinear saddle-point optimization problems, commonly arising in minimax problems, game theory, and constrained optimization. Perturbing payoffs or values is known to be effective in stabilizing learning dynamics and equilibrium computation. However, it requires decreasing perturbation magnitudes to ensure convergence to an equilibrium in the underlying game, resulting in a slower rate. To overcome this, we introduce an asymmetric perturbation approach, where only one player's payoff function is perturbed. Exploiting the near-linear structure of bilinear problems, we show that, for a sufficiently small perturbation, the equilibrium strategy of the asymmetrically perturbed game coincides with an equilibrium strategy of the original game. Building on this property, we develop a perturbation-based learning algorithm with a linear last-iterate convergence rate to an equilibrium strategy of the original game, and we further show how to construct a parameter-free procedure that retains a linear rate. Finally, we empirically demonstrate fast convergence toward equilibria in both normal-form and extensive-form games.
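As a rough illustration of the idea described above, the following is a minimal sketch of projected gradient descent-ascent on an asymmetrically perturbed matrix game, $\min_{x}\max_{y} x^{\top}Ay + \frac{\mu}{2}\|x\|^2$, over probability simplices. It is not the authors' AsymP-GDA specification: the function names `project_simplex` and `asym_perturbed_gda`, the alternating update order, the uniform initialization, and the fixed step size `eta` are all assumptions made for this example.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of a 1-D array v onto the probability simplex."""
    u = np.sort(v)[::-1]                      # entries sorted in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept in the support
    tau = css[rho] / (rho + 1)                # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def asym_perturbed_gda(A, mu, eta, num_steps):
    """Illustrative projected GDA where only the x-player's payoff is perturbed.

    Assumed (hypothetical) setup: both players start from the uniform strategy
    and the y-player responds to the freshly updated x.
    """
    n, m = A.shape
    x = np.full(n, 1.0 / n)
    y = np.full(m, 1.0 / m)
    for _ in range(num_steps):
        # x minimizes x^T A y + (mu/2)||x||^2, so its gradient is A y + mu x.
        x = project_simplex(x - eta * (A @ y + mu * x))
        # y maximizes the unperturbed payoff x^T A y, so its gradient is A^T x.
        y = project_simplex(y + eta * (A.T @ x))
    return x, y


if __name__ == "__main__":
    # A small zero-sum matrix game; mu and eta are arbitrary illustrative choices.
    A = np.array([[0.0, 1.0, -3.0], [-1.0, 0.0, 1.0], [3.0, -1.0, 0.0]])
    x, y = asym_perturbed_gda(A, mu=0.5, eta=0.05, num_steps=5000)
    print("x:", x)
    print("y:", y)
```

The only difference from plain projected GDA is the extra `mu * x` term in the x-player's gradient; the y-player's update is left untouched, which is what makes the perturbation asymmetric.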

Paper Structure

This paper contains 48 sections, 17 theorems, 122 equations, 13 figures, 4 tables, 1 algorithm.

Key Result

Theorem 3.1

Assume that the perturbation strength $\mu$ is set such that $\mu\in (0, \frac{\alpha}{\max_{x\in \mathcal{X}}\left\|x\right\|})$, where $\alpha>0$ is a constant depending only on the game instance. Then, the minimax strategy $x^{\mu}$ in the corresponding asymmetrically perturbed game eq:asymmetric_pe

Figures (13)

  • Figure 1: Symmetric Perturbation
  • Figure 2: Asymmetric Perturbation
  • Figure 4: The landscape of the objective function for player $x$ in asymmetrically perturbed games. The functions $g(x)$ and $g^{\mu}_{\mathrm{asym}}(x)$ are defined as $g(x) :=\max\limits_{y\in \mathcal{Y}}x^{\top}Ay$ and $g^{\mu}_{\mathrm{asym}}(x):=g(x) + \frac{\mu}{2}\|x\|^2$, respectively.
  • Figure 5: Trajectories of strategies for player $x$ using AsymP-GDA, SymP-GDA, and GDA. The game matrix $A$ is set to $A=[[0, 1, -3], [-1, 0, 1], [3, -1, 0]]$, and the strategy spaces are set to $\mathcal{X}=\mathcal{Y}=\Delta^3$. The red point represents the minimax strategy in the original game. The trajectories originate from different initial strategies, demonstrating the learning dynamics under each method.
  • Figure 6: Performance in extensive-form games. AsymP-DGDA performs two strategy updates per iteration for each player. For a fair comparison across methods with different per-iteration computational costs, we report the total number of strategy updates on the x-axis rather than iterations.
  • ...and 8 more figures
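The objective functions in the Figure 4 caption can be evaluated directly when $\mathcal{Y}$ is a probability simplex, as in Figure 5: a linear function of $y$ attains its maximum over the simplex at a vertex, so $g(x)=\max_j (A^{\top}x)_j$. The sketch below uses that fact; the helper names `g` and `g_asym` are chosen here for illustration and are not taken from the paper's code.

```python
import numpy as np


def g(A, x):
    """g(x) = max over the simplex of x^T A y, attained at a pure strategy of y."""
    return float(np.max(A.T @ x))


def g_asym(A, x, mu):
    """Asymmetrically perturbed objective g(x) + (mu / 2) * ||x||^2."""
    return g(A, x) + 0.5 * mu * float(x @ x)


if __name__ == "__main__":
    # Game matrix from the Figure 5 caption.
    A = np.array([[0.0, 1.0, -3.0], [-1.0, 0.0, 1.0], [3.0, -1.0, 0.0]])
    uniform = np.full(3, 1.0 / 3.0)
    candidate = np.array([0.2, 0.6, 0.2])  # satisfies A @ candidate = 0
    for name, x in [("uniform", uniform), ("candidate", candidate)]:
        print(name, g(A, x), g_asym(A, x, mu=0.1))
```

For the Figure 5 matrix, the strategy $(0.2, 0.6, 0.2)$ satisfies $Ax=0$, so $g$ vanishes there, and the perturbation only adds the quadratic term $\frac{\mu}{2}\|x\|^2$ on top.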

Theorems & Definitions (33)

  • Theorem 3.1
  • Remark 3.2: Invariance without knowing the game-dependent constant
  • Theorem 4.1
  • Corollary 4.2
  • Theorem 4.3
  • Remark 4.4: Technical challenge in proving Theorem \ref{thm:lic_rate}
  • Theorem B.1
  • Corollary B.2
  • Theorem C.1
  • Proof of Theorem \ref{thm:optimality_of_perturbed_equilibrium}
  • ...and 23 more