A Survey on Game Theory Optimal Poker

Prathamesh Sonawane, Arav Chheda

TL;DR

The paper surveys game-theoretic approaches to poker in imperfect-information settings, contrasting Game Theory Optimal (GTO) play with exploitative strategies and emphasizing practical abstractions. It reviews abstraction methods (notably bucketing and LP-based solutions) and betting models (discretized and Loki) that make large games tractable, alongside result-oriented strategies such as CFR+. The discussion extends to multi-player settings (e.g., Pluribus) where purely symbolic GTO solutions are computationally intractable, highlighting ML-based exploitative approaches. The authors conclude with limitations and future directions, recommending more automated parameter tuning and hybrid ML-symbolic methods to scale to real-world, multi-agent poker with broader opponent modeling.

Abstract

Poker belongs to the family of imperfect-information games, unlike games such as chess and Connect Four, which are perfect-information games. While many perfect-information games have been solved, no non-trivial imperfect-information game has been solved to date. This makes poker a great test bed for Artificial Intelligence research. In this paper, we first compare Game Theory Optimal poker with exploitative poker. Second, we discuss the intricacies of abstraction techniques, betting models, and specific strategies employed by successful poker bots such as Tartanian [1] and Pluribus [6]. Third, we explore 2-player versus multi-player games and the limitations that arise when playing with more players. Finally, we discuss the role of machine learning and theoretical approaches in developing winning strategies and suggest future directions for this rapidly evolving field.

Paper Structure (13 sections, 4 figures)

Figures (4)

  • Figure 1: Results from the 2007 AAAI Computer Poker Competition. The players are listed in the order in which they placed in that competition. Each cell contains the average number of chips won by the player in the corresponding row against the player in the corresponding column, as well as the standard deviation. The numbers in the table reflect 20 pairwise matches each; in the AAAI competition a further 280 matches were conducted between each pair of the three top-ranked entries in order to get statistical significance, and Tartanian finished second [no-limit-poker-sandholm].
  • Figure 2: Increasing sizes of imperfect-information games solved over time, measured in unique information sets (i.e., after symmetries are removed). The algorithm discussed (CFR+) can solve games of size up to $10^{13}$ [limit-holdem-solved].
  • Figure 3: Transition probabilities visualized [approximating-poker].
  • Figure 4: Example of the Lemonade Stand Game [superhuman-ai]. In this game, participants choose a location on a circular space, aiming to maximize their distance from the others. The game has numerous Nash equilibria, each of which places the players uniformly around the circle. However, when players individually select different equilibria without coordination, the resulting strategy profile typically does not form a Nash equilibrium. This is depicted through illustrations: the left shows three equilibria in distinct colors, while the right shows the non-equilibrium outcome when players independently choose different equilibria.
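
The equilibrium-selection problem described in the Figure 4 caption can be checked numerically. The sketch below is illustrative only and rests on assumptions not taken from the paper: a circle discretized into 12 locations, three players, and a payoff equal to the distance to the nearest other player. The names `N`, `circ_dist`, `payoff`, and `is_nash` are hypothetical helpers.

```python
N = 12  # assumed number of discrete locations on the circle


def circ_dist(a, b):
    """Shortest distance between two locations on a circle of N points."""
    d = abs(a - b) % N
    return min(d, N - d)


def payoff(i, positions):
    """Assumed payoff: distance from player i to the nearest other player."""
    return min(circ_dist(positions[i], p)
               for j, p in enumerate(positions) if j != i)


def is_nash(positions):
    """True if no player can strictly gain by moving alone."""
    for i in range(len(positions)):
        current = payoff(i, positions)
        for alt in range(N):
            trial = list(positions)
            trial[i] = alt
            if payoff(i, trial) > current:
                return False
    return True


# Three equilibria: uniform spacing, rotated by one step each.
eq_a = [0, 4, 8]
eq_b = [1, 5, 9]
eq_c = [2, 6, 10]

# Each player independently follows a different equilibrium.
mixed = [eq_a[0], eq_b[1], eq_c[2]]

print(is_nash(eq_a), is_nash(eq_b), is_nash(eq_c))
print(mixed, is_nash(mixed))
```

Under these assumptions, each uniformly spaced profile passes the unilateral-deviation check, while the profile assembled from three different equilibria does not: the player at location 0 sits only two steps from a neighbour and gains by moving away from it.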