Computing Perfect Bayesian Equilibria, with Application to Empirical Game-Theoretic Analysis
Christine Konicki, Mithun Chakraborty, Michael P. Wellman
TL;DR
The paper addresses the computation of Perfect Bayesian Equilibria (PBE) in two-player imperfect-information extensive-form games. It introduces PBE-CFR, a scalable method based on Counterfactual Regret Minimization (CFR) that enforces AGM-consistency (consistency in the belief-revision sense of Alchourrón, Gärdenfors, and Makinson) between strategies and beliefs. PBE-CFR computes believed utilities $U^B$ under consistent beliefs and updates those beliefs with an UpdateBeliefs routine that uses a plausibility order. The method comes with theoretical guarantees of polynomial space and time complexity and of convergence to a PBE in two-player zero-sum games. Empirical evaluation on general-sum game classes and within a tree-exploiting variant of Policy Space Response Oracles (TE-PSRO) shows that PBE-CFR yields high-quality equilibria and improves TE-PSRO model quality compared to unrefined Nash equilibrium baselines, though performance depends on the game’s information structure. The work demonstrates a practical path to using refined equilibria for strategy exploration in empirical game-theoretic analysis and highlights contexts where PBE offers advantages over meta-strategy solvers (MSSs) based on Nash equilibrium (NE).
Abstract
Perfect Bayesian Equilibrium (PBE) is a refinement of the Nash equilibrium for imperfect-information extensive-form games (EFGs) that enforces consistency between the two components of a solution: the agents' strategy profile, which describes their decisions at information sets, and the belief system, which quantifies their uncertainty over histories within an information set. We present a scalable approach for computing a PBE of an arbitrary two-player EFG. We adopt the definition of PBE proposed by Bonanno in 2011, which uses a consistency concept based on the theory of belief revision due to Alchourrón, Gärdenfors, and Makinson (AGM). Our algorithm, PBE-CFR, is an adaptation of Counterfactual Regret Minimization (CFR) that minimizes the expected regret at each information set given a belief system, while maintaining the necessary consistency criteria. We prove that our algorithm is correct for two-player zero-sum games and incurs a reasonable slowdown in time complexity relative to classical CFR, given the additional computation needed for refinement. We also experimentally demonstrate the competent performance of PBE-CFR in terms of equilibrium quality and running time on medium-to-large non-zero-sum EFGs. Finally, we investigate the effectiveness of using PBE for strategy exploration in empirical game-theoretic analysis. Specifically, we compute PBE as a meta-strategy solver (MSS) in a tree-exploiting variant of Policy Space Response Oracles (TE-PSRO). Our experiments show that, for games with complex imperfect-information structures, PBE as an MSS leads to higher-quality empirical EFG models than MSSs based on an unrefined Nash equilibrium.
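To make the idea of minimizing expected regret at an information set given a belief system concrete, the following is a minimal Python sketch for a single information set. It is not the paper's PBE-CFR: the function names (`bayes_beliefs`, `believed_values`, `regret_matching`), the toy numbers, and the data layout are all assumptions made for illustration. In particular, the uniform fallback for an unreached information set is only a placeholder for the AGM-consistent belief update that the paper defines.

```python
from typing import Dict

# Hypothetical helpers for illustration only; none of these names come from the paper.


def bayes_beliefs(reach: Dict[str, float]) -> Dict[str, float]:
    """Beliefs over the histories of one information set.

    reach[h] is the probability of reaching history h under the current
    strategy profile. If the information set is reached with positive
    probability, Bayes' rule applies. Otherwise this sketch falls back to
    uniform beliefs (an assumption standing in for the paper's
    AGM-consistent update).
    """
    total = sum(reach.values())
    if total > 0.0:
        return {h: p / total for h, p in reach.items()}
    return {h: 1.0 / len(reach) for h in reach}


def believed_values(
    beliefs: Dict[str, float], action_values: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Belief-weighted value of each action at the information set.

    action_values[h][a] is the acting player's payoff for taking action a
    at history h and then following the current profile.
    """
    actions = next(iter(action_values.values())).keys()
    return {
        a: sum(beliefs[h] * action_values[h][a] for h in beliefs)
        for a in actions
    }


def regret_matching(cum_regret: Dict[str, float]) -> Dict[str, float]:
    """Standard regret matching: play in proportion to positive regret."""
    positive = {a: max(r, 0.0) for a, r in cum_regret.items()}
    total = sum(positive.values())
    if total > 0.0:
        return {a: r / total for a, r in positive.items()}
    return {a: 1.0 / len(cum_regret) for a in cum_regret}


# Toy information set with two histories and two actions (made-up numbers).
reach = {"h1": 0.375, "h2": 0.125}
action_values = {
    "h1": {"L": 1.0, "R": 0.0},
    "h2": {"L": -2.0, "R": 2.0},
}
strategy = {"L": 0.5, "R": 0.5}

beliefs = bayes_beliefs(reach)
values = believed_values(beliefs, action_values)
expected = sum(strategy[a] * values[a] for a in values)
regret = {a: values[a] - expected for a in values}
next_strategy = regret_matching(regret)
print(beliefs, values, next_strategy)
```

In this toy example the beliefs put weight 0.75 on `h1` and 0.25 on `h2`, so action `R` has the higher believed value and a single regret-matching step moves all probability onto it. A full CFR-style procedure would accumulate such regrets over many iterations and over every information set in the game tree.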
