Faster Game Solving via Asymmetry of Step Sizes

Linjian Meng; Tianpei Yang; Youzhi Zhang; Zhenxing Ge; Yang Gao

Faster Game Solving via Asymmetry of Step Sizes

Linjian Meng, Tianpei Yang, Youzhi Zhang, Zhenxing Ge, Yang Gao

TL;DR

The paper tackles robustness challenges in Predictive CFR$^+$ (PCFR$^+$) when its predictive signals misalign with observed counterfactual regrets in two-player zero-sum imperfect-information extensive-form games. It introduce Asymmetric PCFR$^+$ (APCFR$^+$) with adaptive asymmetry between implicit and explicit regret updates to dampen the influence of prediction errors, plus Simple APCFR$^+$ (SAPCFR$^+$) for easy implementation. A theoretical regret bound shows the prediction inaccuracy impact is reduced by a factor of $1+\alpha^t$ under APCFR$^+$, and an automatic $\alpha^t$-learning mechanism is proposed; SAPCFR$^+$ fixes $\alpha^t$ to 2 for a single-line modification. Empirical evaluation across standard IIG benchmarks and HUNL subgames demonstrates superior or competitive convergence compared to PCFR$^+$ and related CFR variants, with APCFR$^+$ and SAPCFR$^+$ generally outperforming in most games and APDCFR$^+$ offering substantial gains when combined with discounting. The work suggests that asymmetric step-size updates are broadly applicable to CFR families, offering a practical route to more robust fast-converging solvers in complex strategic games.

Abstract

Counterfactual Regret Minimization (CFR) algorithms are widely used to compute a Nash equilibrium (NE) in two-player zero-sum imperfect-information extensive-form games (IIGs). Among them, Predictive CFR$^+$ (PCFR$^+$) is particularly powerful, achieving an exceptionally fast empirical convergence rate via the prediction in many games.However, the empirical convergence rate of PCFR$^+$ would significantly degrade if the prediction is inaccurate, leading to unstable performance on certain IIGs. To enhance the robustness of PCFR$^+$, we propose Asymmetric PCFR$^+$ (APCFR$^+$), which employs an adaptive asymmetry of step sizes between the updates of implicit and explicit accumulated counterfactual regrets to mitigate the impact of the prediction inaccuracy on convergence. We present a theoretical analysis demonstrating why APCFR$^+$ can enhance the robustness. To the best of our knowledge, we are the first to propose the asymmetry of step sizes, a simple yet novel technique that effectively improves the robustness of PCFR$^+$. Then, to reduce the difficulty of implementing APCFR$^+$ caused by the adaptive asymmetry, we propose a simplified version of APCFR$^+$ called Simple APCFR$^+$ (SAPCFR$^+$), which uses a fixed asymmetry of step sizes to enable only a single-line modification compared to original PCFR$^+$.Experimental results on five standard IIG benchmarks and two heads-up no-limit Texas Hold' em (HUNL) Subagems show that (i) both APCFR$^+$ and SAPCFR$^+$ outperform PCFR$^+$ in most of the tested games, (ii) SAPCFR$^+$ achieves a comparable empirical convergence rate with APCFR$^+$,and (iii) our approach can be generalized to improve other CFR algorithms, e.g., Discount CFR (DCFR).

Faster Game Solving via Asymmetry of Step Sizes

TL;DR

The paper tackles robustness challenges in Predictive CFR

(PCFR

) when its predictive signals misalign with observed counterfactual regrets in two-player zero-sum imperfect-information extensive-form games. It introduce Asymmetric PCFR

(APCFR

) with adaptive asymmetry between implicit and explicit regret updates to dampen the influence of prediction errors, plus Simple APCFR

(SAPCFR

) for easy implementation. A theoretical regret bound shows the prediction inaccuracy impact is reduced by a factor of

under APCFR

, and an automatic

-learning mechanism is proposed; SAPCFR

fixes

to 2 for a single-line modification. Empirical evaluation across standard IIG benchmarks and HUNL subgames demonstrates superior or competitive convergence compared to PCFR

and related CFR variants, with APCFR

and SAPCFR

generally outperforming in most games and APDCFR

offering substantial gains when combined with discounting. The work suggests that asymmetric step-size updates are broadly applicable to CFR families, offering a practical route to more robust fast-converging solvers in complex strategic games.

Abstract

(PCFR

) is particularly powerful, achieving an exceptionally fast empirical convergence rate via the prediction in many games.However, the empirical convergence rate of PCFR

would significantly degrade if the prediction is inaccurate, leading to unstable performance on certain IIGs. To enhance the robustness of PCFR

, we propose Asymmetric PCFR

(APCFR

), which employs an adaptive asymmetry of step sizes between the updates of implicit and explicit accumulated counterfactual regrets to mitigate the impact of the prediction inaccuracy on convergence. We present a theoretical analysis demonstrating why APCFR

can enhance the robustness. To the best of our knowledge, we are the first to propose the asymmetry of step sizes, a simple yet novel technique that effectively improves the robustness of PCFR

. Then, to reduce the difficulty of implementing APCFR

caused by the adaptive asymmetry, we propose a simplified version of APCFR

called Simple APCFR

(SAPCFR

), which uses a fixed asymmetry of step sizes to enable only a single-line modification compared to original PCFR

.Experimental results on five standard IIG benchmarks and two heads-up no-limit Texas Hold' em (HUNL) Subagems show that (i) both APCFR

and SAPCFR

outperform PCFR

in most of the tested games, (ii) SAPCFR

achieves a comparable empirical convergence rate with APCFR

,and (iii) our approach can be generalized to improve other CFR algorithms, e.g., Discount CFR (DCFR).

Faster Game Solving via Asymmetry of Step Sizes

TL;DR

Abstract

Faster Game Solving via Asymmetry of Step Sizes

TL;DR

Abstract

Paper Structure

Table of Contents

Key Result

Figures (13)

Theorems & Definitions (6)