Modeling Other Players with Bayesian Beliefs for Games with Incomplete Information

Zuyuan Zhang; Mahdi Imani; Tian Lan

TL;DR

The paper tackles decision-making in Bayesian games with incomplete information by marrying Bayesian belief updates with counterfactual regret minimization. It introduces Bayesian-CFR, which uses a kernel-density posterior to estimate players' type distributions and defines Bayesian regret to drive equilibrium computation, with a theoretical regret bound that includes a time-varying term $\Delta_{\Theta}^T$. Extensions to Bayesian CFR+ and Deep Bayesian CFR demonstrate scalable performance, aided by type-aware neural architectures and accumulated Bayesian regret. Empirical evaluation in Texas Hold'em shows substantial exploitability reductions versus traditional CFR baselines, validating both the methodology and its practical impact for reasoning about other agents' hidden types. Overall, the work provides a rigorous, scalable framework for solving Bayesian Nash Equilibria in extensive-form games under partial information.

Abstract

Bayesian games model interactive decision-making in which players have incomplete information (e.g., about payoffs, or about other players' private strategies and preferences) and must actively reason about and update their belief models of that information using observation and interaction history. Existing work on counterfactual regret minimization (CFR) has shown great success for games with complete or imperfect information, but not for Bayesian games. To this end, we introduce a new CFR algorithm, Bayesian-CFR, and analyze its regret bound with respect to Bayesian Nash Equilibria in Bayesian games. First, we present a method for updating the posterior distribution of beliefs about the game and other players' types. The method uses a kernel-density estimate and is shown to converge to the true distribution. Second, we define Bayesian regret and present a Bayesian-CFR minimization algorithm for computing the Bayesian Nash Equilibrium. Finally, we extend this approach to other existing algorithms, yielding Bayesian-CFR+ and Deep Bayesian CFR. Experimental results show that our proposed solutions significantly outperform existing methods in classical Texas Hold'em games.
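To make the idea of regret driven by beliefs over types concrete, the following is a minimal sketch of a single regret-matching step at one decision point, where action utilities are averaged over opponent types under the current belief. It is an illustration under stated assumptions, not the paper's Bayesian-CFR definition: the function names, the array layout, and the choice to weight utilities linearly by the belief are all hypothetical.

```python
import numpy as np

def regret_matching(cum_regret):
    """Standard regret matching: play actions in proportion to positive regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret yet: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

def belief_weighted_update(cum_regret, strategy, utilities_by_type, belief):
    """Accumulate regret using utilities averaged over opponent types.

    utilities_by_type: array (n_types, n_actions), hypothetical per-type
        counterfactual utilities of each action.
    belief: array (n_types,), current posterior over opponent types.
    """
    expected_u = belief @ utilities_by_type   # belief-averaged utility per action
    node_value = strategy @ expected_u        # value of the current mixed strategy
    return cum_regret + (expected_u - node_value)

# Usage: two actions, two opponent types, belief concentrated on type 0.
cum_regret = np.zeros(2)
strategy = regret_matching(cum_regret)
utilities = np.array([[1.0, 0.0],
                      [0.0, 1.0]])
belief = np.array([0.9, 0.1])
cum_regret = belief_weighted_update(cum_regret, strategy, utilities, belief)
next_strategy = regret_matching(cum_regret)
```

In this toy setting the next strategy shifts toward the action that does well against the more likely type; a full CFR-style solver would apply such updates across all information sets and iterations.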

Paper Structure (22 sections, 12 theorems, 50 equations, 3 figures, 2 tables, 4 algorithms)

Introduction
Background
Extensive-Form Bayesian Games
Counterfactual Regret Minimization in imperfect game
Bayesian-CFR
Update Posterior: Evidence from the other players
Bayesian-CFR Minimization
Extend Bayesian Counterfactual Regret Minimization to other algorithms
Bayesian CFR+
Bayesian Deep CFR
Numerical Evaluation
Evaluation Environment: Texas hold'em.
Evaluation against baselines.
Conclusion
Proof of Bayesion
...and 7 more sections

Key Result

Lemma 3.3

mandyam2023kernel Let $w_m, w'_m > 0$ be the bandwidths chosen to estimate the joint probability and marginal probability, respectively, with $m w_m^{d/2} \rightarrow \infty$ and $m {w'_m}^{d'/2} \rightarrow \infty$ as $m \rightarrow \infty$, where $d$ is the dimension of $(h,\theta)$ and $d'$ …
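The lemma refers to two kernel-density estimates, one for the joint density of $(h,\theta)$ and one for a marginal density, each with its own bandwidth. A minimal sketch of how such a ratio estimator could produce a posterior over types given an observed history is shown below. The Gaussian kernel, the numeric encoding of histories and types as vectors, the finite grid of candidate types, and all function names are assumptions made for illustration; the paper's exact estimator may differ.

```python
import numpy as np

def gaussian_kernel(u):
    """Isotropic Gaussian kernel evaluated on each row of u (shape n x d)."""
    d = u.shape[1]
    return np.exp(-0.5 * np.sum(u * u, axis=1)) / (2.0 * np.pi) ** (d / 2.0)

def kde(samples, query, bandwidth):
    """Kernel density estimate at `query` from `samples` (shape m x d)."""
    m, d = samples.shape
    u = (samples - query) / bandwidth
    return gaussian_kernel(u).sum() / (m * bandwidth ** d)

def type_posterior(histories, types, h_query, type_grid, w_joint, w_marg):
    """Estimate p(theta | h) on a grid of types as joint KDE / marginal KDE.

    histories: (m, d_h) observed history features (hypothetical encoding).
    types:     (m, d_t) types paired with those histories.
    w_joint, w_marg: bandwidths for the joint and marginal estimates.
    """
    joint_samples = np.hstack([histories, types])
    marginal = max(kde(histories, h_query, w_marg), 1e-12)
    post = np.array([
        kde(joint_samples, np.concatenate([h_query, theta]), w_joint)
        for theta in type_grid
    ]) / marginal
    total = post.sum()
    if total <= 0:
        # No kernel mass near the query: fall back to a uniform belief.
        return np.full(len(type_grid), 1.0 / len(type_grid))
    return post / total  # renormalise over the finite grid

# Usage: type 0.0 tends to produce histories near -2, type 1.0 near +2.
rng = np.random.default_rng(0)
types = np.repeat([[0.0], [1.0]], 200, axis=0)
histories = np.where(types == 0.0, -2.0, 2.0) + 0.5 * rng.standard_normal((400, 1))
grid = np.array([[0.0], [1.0]])
posterior = type_posterior(histories, types, np.array([2.0]), grid, 0.5, 0.5)
```

Because the result is renormalised over a finite grid, the marginal estimate cancels in this sketch; it is kept only to mirror the joint-over-marginal structure that the two bandwidths in the lemma suggest.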

Figures (3)

Figure 1: A comparison of our proposed Bayesian-CFR algorithms with baselines, including CFR zinkevich2007regret, CFR+ tammelin2014solving, MCCFR lanctot2009monte, Deep CFR brown2019deep, and DQN mnih2013playing in Texas hold'em. The top row shows Bayesian games with pure-type players, and the bottom row shows Bayesian games with mixed-type players. A smaller exploitability implies a closer "distance" to the Bayesian Nash Equilibrium. Our Bayesian-CFR algorithms significantly outperform all baselines.
Figure 2: Exploitability under three player types: normal (left), conservative (middle), and aggressive (right).
Figure 3: This experiment focuses on players of the conservative type. From left to right, the baseline methods used are CFR, CFR+, and Deep CFR. Within each experiment, there are several variations: BCFR, BCFR with knowledge of the player's type, BCFR using uniformly distributed player's type, CFR, and CFR with knowledge of the player's type.

Theorems & Definitions (19)

Lemma 3.3
Theorem 3.5
Theorem 3.6
Theorem 3.7
Theorem 4.1
Lemma A.1
proof
proof
proof
Lemma B.1
...and 9 more
