On Tractable $Φ$-Equilibria in Non-Concave Games

Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng

TL;DR

The paper studies the computation of equilibria in non-concave games by revisiting the classical Φ-equilibrium concept. It shows that when the strategy-modification set Φ is finite, an efficient uncoupled online learner achieves sublinear Φ-regret and therefore converges to an ε-approximate Φ-equilibrium, with a per-iteration cost that scales with √T and |Φ|. For infinite sets of local (δ-bounded) deviations, the authors introduce three tractable families of local modifications (proximal, convex combinations, and interpolation) and prove that Online Gradient Descent, in some cases with optimism, achieves sublinear proximal regret, yielding efficient convergence in the first-order stationary regime where ε = Ω(δ^2). They also establish hardness results showing that achieving ε = o(δ^2) is NP-hard, which indicates that this regime is tight. Together, the results give decentralized learning procedures and sharp complexity boundaries for tractable Φ-equilibria in non-concave settings, with proximal regret and proximal-operator-based deviations as the main new tools.

Abstract

While Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to a coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when utilities are non-concave -- a common scenario in machine learning applications involving strategies parameterized by deep neural networks, or when agents' utilities are computed by neural networks, or both. Non-concave games introduce significant game-theoretic and optimization challenges: (i) Nash equilibria may not exist; (ii) local Nash equilibria, though they exist, are intractable; and (iii) mixed Nash, correlated, and coarse correlated equilibria generally have infinite support and are intractable. To sidestep these challenges, we revisit the classical solution concept of $Φ$-equilibria introduced by Greenwald and Jafari [2003], which is guaranteed to exist for an arbitrary set of strategy modifications $Φ$ even in non-concave games [Stoltz and Lugosi, 2007]. However, the tractability of $Φ$-equilibria in such games remains elusive. In this paper, we initiate the study of tractable $Φ$-equilibria in non-concave games and examine several natural families of strategy modifications. We show that when $Φ$ is finite, there exists an efficient uncoupled learning algorithm that converges to the corresponding $Φ$-equilibria. Additionally, we explore cases where $Φ$ is infinite but consists of local modifications. We show that approximating local $Φ$-equilibria beyond the first-order stationary regime is computationally intractable. In contrast, within this regime, we show Online Gradient Descent efficiently converges to $Φ$-equilibria for several natural infinite families of modifications, including a new structural family of modifications inspired by the well-studied proximal operator.

Paper Structure (90 sections, 30 theorems, 115 equations, 3 figures, 1 table, 5 algorithms)

Key Result

Theorem 1

If each player $i \in [n]$ has $\Phi_i$-regret that is upper bounded by $\mathrm{Reg}^T_{\Phi_i}$, then their empirical distribution of strategy profiles played is an $(\max_{i\in [n]} \mathrm{Reg}^T_{\Phi_i} / T)$-approximate $\Phi$-equilibrium.
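
To make the statement concrete, here is a minimal numerical sketch in Python. Everything in it is an assumption chosen for illustration and does not come from the paper: the two-player game on $[0,1]$, the utilities `u1` and `u2`, the finite modification set `PHI`, and the randomly generated play history. It also assumes the standard utility-based definition of $\Phi$-regret, namely the largest total gain from applying one fixed $\phi \in \Phi_i$ to every past strategy. The sketch computes each player's $\Phi$-regret over the history and, separately, the largest gain any modification offers under the empirical distribution of play, then checks that the latter is at most $\max_i \mathrm{Reg}^T_{\Phi_i} / T$.

```python
import math
import random


def phi_regret(utility, own_plays, other_plays, modifications):
    """Phi-regret: best total gain from one fixed modification phi in hindsight."""
    return max(
        sum(utility(phi(x), y) - utility(x, y)
            for x, y in zip(own_plays, other_plays))
        for phi in modifications
    )


def equilibrium_gap(utility, own_plays, other_plays, modifications):
    """Largest expected gain from any phi under the empirical distribution of play."""
    T = len(own_plays)
    base = sum(utility(x, y) for x, y in zip(own_plays, other_plays)) / T
    return max(
        sum(utility(phi(x), y) for x, y in zip(own_plays, other_plays)) / T - base
        for phi in modifications
    )


# Hypothetical non-concave utilities; the first argument is the player's own strategy.
def u1(x, y):
    return math.sin(6 * x) * y - (x - y) ** 2


def u2(y, x):
    return math.cos(5 * y) * x + x * y


def clip(x):
    return min(1.0, max(0.0, x))


# Hypothetical finite modification set: small clipped shifts (including the
# identity, shift 0.0) and three constant maps.
PHI = [lambda x, s=s: clip(x + s) for s in (-0.1, 0.0, 0.1)]
PHI += [lambda x, c=c: c for c in (0.0, 0.5, 1.0)]

# Placeholder play history: random strategies, not the output of any learner.
random.seed(0)
T = 200
xs = [random.random() for _ in range(T)]
ys = [random.random() for _ in range(T)]

regs = [phi_regret(u1, xs, ys, PHI), phi_regret(u2, ys, xs, PHI)]
eps = max(equilibrium_gap(u1, xs, ys, PHI), equilibrium_gap(u2, ys, xs, PHI))

print("average Phi-regret bound:", max(regs) / T)
print("empirical equilibrium gap:", eps)
```

Because the history here is random rather than produced by a no-regret learner, the gap need not be small; the point is only that the empirical distribution's approximation error is controlled by the players' average $\Phi$-regret, so a learner whose regret grows sublinearly in $T$ drives the gap to zero.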

Figures (3)

  • Figure 1: The relationship between different solution concepts in non-concave games. An arrow from one solution concept to another means the former is contained in the latter. The dashed arrow from $\mathrm{conv}(\Phi(\delta))$-equilibria to $\Phi_{\mathrm{finite}}$-equilibria means the former is contained in the latter when $\Phi(\delta) = \Phi_{\mathrm{finite}}$.
  • Figure 2: Complexity of computing an $\varepsilon$-approximate $\delta$-local Nash equilibrium and an $\varepsilon$-approximate $\Phi(\delta)$-equilibrium in $G$-Lipschitz and $L$-smooth $d$-dimensional games. We consider cases where $G, L = O(\mathrm{poly}(d))$. The regime $\varepsilon \ge G\delta$ is trivial since the game is $G$-Lipschitz. The PPAD-hardness of approximate local Nash equilibrium follows from that of approximate (global) Nash equilibrium in bimatrix games, due to linearity of the utility function [Chen et al., 2009]. The PPAD-hardness of approximate local Nash equilibrium in two-player zero-sum games is proved in [Daskalakis et al., 2021]. The NP-hardness of $\varepsilon$-approximate $\Phi(\delta)$-equilibrium is proven for $\Phi_{\mathrm{All}}(\delta)$ (\ref{thm:hardnessFOSall swap}) and $\Phi_{\mathrm{Int}^+}(\delta)$ (\ref{thm:hardnessFOS restricted}) in \ref{sec:first-order regime hardness}. The NP-hardness of $\varepsilon$-approximate $\delta$-local Nash equilibrium is implied by \ref{corollary:local maximizer hardness}. The positive results for $\varepsilon$-approximate $\Phi(\delta)$-equilibrium in the first-order stationary regime hold for $\Phi_{\mathrm{prox}}(\delta)$ (\ref{sec:proximal regret}), $\Phi_{\mathrm{int}}(\delta)$ (\ref{sec:phi-int-regret minization}), and $\mathrm{conv}(\Phi(\delta))$ when $|\Phi(\delta)|$ is finite (\ref{sec:convex-phi}).
  • Figure 3: Illustration of $\phi_{\mathrm{proj}, v}(x)$ and $\phi_{\mathrm{beam}, v}(x)$

Theorems & Definitions (76)

  • Definition 1: $\Phi$-equilibrium [Greenwald and Jafari, 2003; Stoltz and Lugosi, 2007]
  • Definition 2: $\Phi$-regret
  • Theorem 1 [Greenwald and Jafari, 2003]
  • Theorem 2
  • Remark 1
  • Corollary 1
  • Claim 1
  • Definition 3: $\delta$-local strategy modification
  • Lemma 1: From Non-Concave to Linear Losses in First-Order Stationary Regime
  • Definition 4: Proximal Operator
  • ...and 66 more