Statistical Equilibrium of Optimistic Beliefs

Yu Gui, Bahar Taşkesen

TL;DR

The paper introduces the Statistical Equilibrium of Optimistic Beliefs (SE-OB) for mixed extensions of finite normal-form games, unifying Nash equilibrium and Quantal Response Equilibria through optimistic perturbation beliefs and per-action risk preferences. Players choose perturbation distributions from prescribed belief sets to maximize their highest perturbed payoff, yielding a semi-parametric equilibrium framework that reduces to Nash or QRE under singleton beliefs or Fréchet ambiguity sets, respectively. A learning algorithm with zeroth-order feedback is developed, and the authors prove its convergence to SE-OB under suitable conditions, with the dynamic belief formation providing smoothing that stabilizes convergence. The approach delivers the first generic convergent algorithm for general-form structural QRE beyond logit-QRE, and it highlights the roles of risk sensitivity and optimistic belief formation in steering equilibrium attainment and mitigating overly aggressive play in repeated interactions.

Abstract

We introduce the Statistical Equilibrium of Optimistic Beliefs (SE-OB) for the mixed extension of finite normal-form games, drawing insights from discrete choice theory. Departing from the conventional best responders of Nash equilibrium and the better responders of quantal response equilibrium, we reconceptualize player behavior as that of optimistic better responders. In this setting, the players assume that their expected payoffs are subject to random perturbations, and form optimistic beliefs by selecting the distribution of perturbations that maximizes their highest anticipated payoffs among belief sets. In doing so, SE-OB subsumes and extends the existing equilibria concepts. The player's view of the existence of perturbations in their payoffs reflects an inherent risk sensitivity, and thus, each player is equipped with a risk-preference function for every action. We demonstrate that every Nash equilibrium of a game, where expected payoffs are regularized with the risk-preference functions of the players, corresponds to an SE-OB in the original game, provided that the belief sets coincide with the feasible set of a multi-marginal optimal transport problem with marginals determined by risk-preference functions. Building on this connection, we propose an algorithm for repeated games among risk-sensitive players under optimistic beliefs when only zeroth-order feedback is available. We prove that, under appropriate conditions, the algorithm converges to an SE-OB. Our convergence analysis offers key insights into the strategic behaviors for equilibrium attainment: a player's risk sensitivity enhances equilibrium stability, while forming optimistic beliefs in the face of ambiguity helps to mitigate overly aggressive strategies over time. As a byproduct, our approach delivers the first generic convergent algorithm for general-form structural QRE beyond the classical logit-QRE.
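For orientation, the classical logit-QRE that the abstract treats as the baseline can be sketched numerically. The snippet below is not the paper's algorithm: it is a minimal damped fixed-point iteration on the standard logit response (each action's probability proportional to `exp(lam * expected_payoff)`), and it uses full payoff matrices rather than zeroth-order feedback. The function names, the matching-pennies payoffs, the precision `lam`, and the damping `step` are all illustrative assumptions.

```python
import math


def softmax(values, lam):
    """Logit choice probabilities with precision parameter lam."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(lam * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def expected_payoffs(payoff, opponent_mix):
    """Expected payoff of each own action; payoff[own][opponent]."""
    return [
        sum(row[b] * opponent_mix[b] for b in range(len(opponent_mix)))
        for row in payoff
    ]


def logit_qre(A, B, lam=1.0, step=0.5, iters=500):
    """Damped fixed-point iteration for a two-player logit-QRE.

    A[a][b] is the row player's payoff, B[b][a] the column player's
    payoff (each matrix is indexed by the owner's action first).
    """
    p = [0.9, 0.1]  # arbitrary (hypothetical) starting mixes
    q = [0.2, 0.8]
    for _ in range(iters):
        p_target = softmax(expected_payoffs(A, q), lam)
        q_target = softmax(expected_payoffs(B, p), lam)
        # Move both players part of the way toward their logit responses.
        p = [(1 - step) * x + step * t for x, t in zip(p, p_target)]
        q = [(1 - step) * y + step * t for y, t in zip(q, q_target)]
    return p, q


# Matching pennies: the row player wants to match, the column player to mismatch.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
p_star, q_star = logit_qre(A, B)
print(p_star, q_star)  # both mixes approach [0.5, 0.5]
```

In this symmetric zero-sum example the uniform mix is the logit-QRE for any `lam`, so the sketch mainly shows the shape of the fixed-point condition; the damping is what keeps the simultaneous updates from cycling.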

Paper Structure

This paper contains 17 sections, 12 theorems, 50 equations, 1 algorithm.

Key Result

Proposition 4.2

For some $b \in \mathbb{R}$, if $\mathcal{B}_j = \{\otimes_{i=1}^N \delta_{b \cdot \boldsymbol{1}}\}$ for all $j \in [M]$, then $\boldsymbol{P} = (\boldsymbol{p}_1, \ldots, \boldsymbol{p}_M)$ forms an SE-OB of the game $\mathcal{G}([M], \Delta^N, (u_j)_{j \in [M]})$ if and only if it satisfies eq:NE of
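One way to read the singleton-belief case: if every action's payoff is perturbed by the same deterministic constant $b$, the ranking of actions cannot change, so the perturbed best responses coincide with the unperturbed ones. The check below is a small sketch of that intuition only. It assumes the perturbation enters additively, which this excerpt does not spell out, and the payoff vector and helper name are hypothetical.

```python
def best_responses(payoffs, tol=1e-12):
    """Indices of the actions attaining the maximum payoff."""
    top = max(payoffs)
    return {a for a, v in enumerate(payoffs) if v >= top - tol}


# Hypothetical expected payoffs of one player's four actions.
expected = [0.3, 1.2, 1.2, -0.5]

# Degenerate belief: every action is shifted by the same constant b.
b = 2.0
shifted = [v + b for v in expected]

original_set = best_responses(expected)
shifted_set = best_responses(shifted)
print(original_set == shifted_set)  # True: the argmax set is unchanged
```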

Theorems & Definitions (29)

  • Definition 3.1: Nash equilibrium
  • Definition 3.2: Quantal Response Equilibrium
  • Example 3.3: Logit-QRE (McKelvey and Palfrey, 1995)
  • Definition 4.1: Statistical Equilibrium of Optimistic Beliefs
  • Proposition 4.2
  • Proposition 4.3
  • Lemma 4.4
  • Definition 4.5: Smooth expected payoffs
  • Lemma 4.6
  • Theorem 4.7
  • ...and 19 more