Beyond Nash Equilibrium: Achieving Bayesian Perfect Equilibrium with Belief Update Fictitious Play
Qi Ju, Zhemei Fang, Yunfeng Luo
TL;DR
This paper introduces Belief Update Fictitious Play (BUFP), a belief-aware extension of fictitious play (FP) designed to reach Bayesian Perfect Equilibrium (BPE) in extensive-form games with incomplete information. BUFP combines belief updates inspired by the Bayesian Action Decoder with generalized weakened fictitious play (GWFP) and uses adjustable iteration stepsizes to converge toward both Nash Equilibrium (NE) and BPE, showing notable gains over Counterfactual Regret Minimization (CFR) in dominated-strategy settings on 5-Leduc poker. The authors prove that BUFP(EF) aligns with GWFP under the stepsizes of Extensive Form Fictitious Play (EFFP), and they support convergence with theoretical conditions and with experiments on Kuhn and Leduc poker. They also release their code publicly and outline future work on extending BUFP to additional FP variants and improving its efficiency, which would broaden its use in strategic decision-making against irrational or suboptimal opponents.
Abstract
In machine learning and game theory, finding a Nash Equilibrium (NE) in extensive-form games with incomplete information is challenging yet crucial for improving AI decision support across varied scenarios. Traditional Counterfactual Regret Minimization (CFR) techniques excel at converging toward NE, but they focus on scenarios in which opponents play optimal strategies. Machine learning for strategic game play, however, has to do more than react to optimal moves: it should support human decision-making in all circumstances. That includes not only responding to optimal strategies but also recovering from suboptimal decisions and capitalizing on opponents' errors. This motivates the move from NE to Bayesian Perfect Equilibrium (BPE), which accounts for every possible situation, including irrational behavior by opponents. To bridge this gap, we propose Belief Update Fictitious Play (BUFP), which combines fictitious play with belief updates to target BPE, a more comprehensive equilibrium concept than NE. Specifically, by adjusting its iteration stepsizes, BUFP can converge to both NE and BPE. For instance, in our experiments, BUFP(EF) uses the stepsize of Extensive Form Fictitious Play (EFFP) to achieve BPE and outperforms traditional CFR, securing a 48.53% increase in benefits in scenarios characterized by dominated strategies.
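To make the role of the iteration stepsize concrete, the following is a minimal sketch of plain fictitious play on a two-player zero-sum matrix game. It is not BUFP and performs no belief updates. It only illustrates the averaging step that the stepsize controls, in which each player's strategy is moved toward a best response to the opponent's current average strategy. The function names `fictitious_play` and `exploitability`, the normal-form setting, and the default stepsize 1/(t+1) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def fictitious_play(payoff, steps=5000, stepsize=lambda t: 1.0 / (t + 1)):
    """Simultaneous fictitious play on a zero-sum matrix game (illustrative sketch).

    payoff[i, j] is the row player's payoff; the column player receives its negative.
    stepsize(t) is the averaging weight at iteration t (hypothetical default: 1/(t+1)).
    """
    n_rows, n_cols = payoff.shape
    row = np.full(n_rows, 1.0 / n_rows)  # start from uniform strategies
    col = np.full(n_cols, 1.0 / n_cols)
    for t in range(1, steps + 1):
        # Pure best responses to the opponent's current average strategy.
        br_row = np.zeros(n_rows)
        br_row[np.argmax(payoff @ col)] = 1.0
        br_col = np.zeros(n_cols)
        br_col[np.argmin(row @ payoff)] = 1.0
        # Averaging step: the stepsize sets how far each strategy moves.
        a = stepsize(t)
        row = (1.0 - a) * row + a * br_row
        col = (1.0 - a) * col + a * br_col
    return row, col


def exploitability(payoff, row, col):
    """Sum of both players' best-response gains; zero exactly at a Nash equilibrium."""
    return float(np.max(payoff @ col) - np.min(row @ payoff))
```

Calling `fictitious_play(np.array([[1.0, -1.0], [-1.0, 1.0]]))` on matching pennies should return strategies close to uniform, with low exploitability. Passing a different `stepsize` function changes how quickly early iterations are forgotten, which is the kind of knob the abstract refers to when it mentions adjusting iteration stepsizes.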
