Observable Perfect Equilibrium

Sam Ganzfried

TL;DR

The paper addresses the problem of selecting among multiple Nash equilibria in sequential imperfect-information games and introduces observable perfect equilibrium (OPE), a refinement that restricts trembles to actions observable by opponents. It formalizes OPE for two-player, one-step extensive-form imperfect-information games, proves existence in the two-player zero-sum setting, and gives a polynomial-time linear-programming approach for computing OPE that uses trembles consistent with observed play. Through a detailed no-limit poker example (the no-limit clairvoyance game with $n=2$), the paper shows that OPE can yield different strategy recommendations than existing refinements such as extensive-form trembling-hand perfect equilibrium (EFTHPE), quasi-perfect equilibrium (QPE), and one-sided quasi-perfect equilibrium (OSQPE), which matters in practice for AI agents in imperfect-information domains. The work argues that OPE aligns with rational behavior under observed mistakes, offers computational advantages, and can be extended to larger, multi-player, and non-zero-sum games.

Abstract

While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria, and we must determine how to select between them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and recently one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes and are guaranteed to always exist; however, we argue that none of these is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium, in which the solution is robust over trembles in publicly observable action probabilities (rather than over all action probabilities, some of which may not be observable by opposing players). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given mistakes that have been observed (while previous solution concepts do not). We prove that observable perfect equilibrium is always guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.

Paper Structure

This paper contains 4 sections, 13 theorems, 31 equations, 1 figure.

Key Result

Theorem 1

Let $\sigma^*$ be a strategy profile of an $n$-player strategic-form game $G$. Then $\sigma^*$ is a trembling-hand perfect equilibrium if and only if $\sigma^*$ is a limit point of a sequence $\{\sigma(\epsilon)\}_{\epsilon \rightarrow 0}$ of totally mixed strategy profiles with the property that …
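
The statement above is cut off in this listing, so the following is only an illustrative sketch. It assumes the missing clause is the standard best-response condition from Selten's characterization: each player's strategy in $\sigma^*$ is a best response to the other players' strategies in $\sigma(\epsilon)$. The 2x2 game, the uniform-mixing perturbation, and all function names are hypothetical choices made for illustration and are not taken from the paper. Passing the check for the tested values of $\epsilon$ is consistent with the characterization for this one sequence; failing it for one sequence does not in general rule out perfection, although in this particular game the action D is weakly dominated and so fails against every totally mixed opponent strategy.

```python
from fractions import Fraction

# Hypothetical 2x2 game (not from the paper): rows U, D for player 1,
# columns L, R for player 2. Both players get 1 at (U, L) and 0 elsewhere,
# so (U, L) and (D, R) are both Nash equilibria.
PAYOFF_1 = [[1, 0], [0, 0]]
PAYOFF_2 = [[1, 0], [0, 0]]


def perturb(strategy, eps):
    """Mix with the uniform distribution so every action has positive probability."""
    n = len(strategy)
    return [(1 - eps) * p + eps * Fraction(1, n) for p in strategy]


def row_action_values(payoff, col_strategy):
    """Expected payoff of each row action against a column strategy."""
    return [sum(payoff[i][j] * col_strategy[j] for j in range(len(col_strategy)))
            for i in range(len(payoff))]


def col_action_values(payoff, row_strategy):
    """Expected payoff of each column action against a row strategy."""
    return [sum(payoff[i][j] * row_strategy[i] for i in range(len(row_strategy)))
            for j in range(len(payoff[0]))]


def is_best_response(strategy, action_values):
    """True if every action played with positive probability attains the maximum value."""
    best = max(action_values)
    return all(p == 0 or v == best for p, v in zip(strategy, action_values))


def passes_uniform_tremble_check(row, col, epsilons):
    """Check the assumed best-response condition along one totally mixed sequence."""
    for eps in epsilons:
        row_eps = perturb(row, eps)
        col_eps = perturb(col, eps)
        if not is_best_response(row, row_action_values(PAYOFF_1, col_eps)):
            return False
        if not is_best_response(col, col_action_values(PAYOFF_2, row_eps)):
            return False
    return True


EPSILONS = [Fraction(1, 10 ** k) for k in range(1, 6)]
U_L = ([Fraction(1), Fraction(0)], [Fraction(1), Fraction(0)])
D_R = ([Fraction(0), Fraction(1)], [Fraction(0), Fraction(1)])

print(passes_uniform_tremble_check(*U_L, EPSILONS))  # True
print(passes_uniform_tremble_check(*D_R, EPSILONS))  # False
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality test inside `is_best_response` is not affected by floating-point rounding as $\epsilon$ shrinks.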

Figures (1)

  • Figure 1: No-limit clairvoyance game with $n = 2$.

Theorems & Definitions (20)

  • Theorem 1
  • Definition 1
  • Proposition 1
  • proof
  • Lemma 1
  • Corollary 1
  • Lemma 2
  • Corollary 2
  • Proposition 2
  • Proposition 3
  • ...and 10 more