Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis

Akshat Gupta

TL;DR

The paper investigates whether ChatGPT and GPT-4 can play no-limit Hold'em poker, focusing on the pre-flop decision when acting first at a 9-player table. Using both basic and GTO-oriented prompts, it analyzes decision matrices and hand ranges to measure proximity to game theory optimal (GTO) play. Results show that, despite strong domain knowledge, neither model achieves true GTO play: ChatGPT tends to be conservative (nit-like) while GPT-4 is overly aggressive, and prompting alone cannot reconcile their behavior with GTO. The work highlights how prompt design and model tendencies shape strategic play under uncertainty, with implications for deploying LLMs in strategic decision tasks and for guiding future alignment efforts. The findings also illustrate the practical challenges of translating theoretical game-theoretic optima into learned, text-based decision systems.

Abstract

Since the introduction of ChatGPT and GPT-4, these models have been tested across a large number of tasks. Their adeptness across domains is evident, but their aptitude in playing games, and specifically in poker, has remained unexplored. Poker is a game that requires decision making under uncertainty and incomplete information. In this paper, we put ChatGPT and GPT-4 through the poker test and evaluate their poker skills. Our findings reveal that while both models display an advanced understanding of poker, encompassing concepts like the valuation of starting hands, playing positions and other intricacies of game theory optimal (GTO) poker, both ChatGPT and GPT-4 are NOT game theory optimal poker players. Profitable strategies in poker are evaluated in expectation over large samples. Through a series of experiments, we first discover the characteristics of optimal prompts and model parameters for playing poker with these models. Our observations then unveil the distinct playing personas of the two models. We first conclude that GPT-4 is a more advanced poker player than ChatGPT. This exploration then sheds light on the divergent poker tactics of the two models: ChatGPT's conservativeness juxtaposed against GPT-4's aggression. In poker vernacular, when tasked to play GTO poker, ChatGPT plays like a nit, which means that it has a propensity to engage only with premium hands and folds a majority of hands. When subjected to the same directive, GPT-4 plays like a maniac, showcasing a loose and aggressive style of play. Both strategies, although relatively advanced, are not game theory optimal.

Paper Structure (17 sections, 10 figures)

Figures (10)

  • Figure 1: A 9-player poker table with the positions named. The order of action in the pre-flop betting round is from UTG to BB.
  • Figure 2: GTO pre-flop strategy in Raise First In spots [rficharts]. Red shows raised hands, green shows limped hands, and white shows folded hands.
  • Figure 3: ChatGPT's pre-flop strategy in Raise First In spots. Red shows raised hands, green shows limped hands, and yellow shows folded hands. This decision matrix is for the short user prompt with ranked order of card presentation.
  • Figure 4: ChatGPT's pre-flop strategy in Raise First In spots, using the short user prompt and ranked order of card presentation. Here, ChatGPT is specifically asked to play GTO poker.
  • Figure 5: ChatGPT action percentage as a function of position. The actions available to ChatGPT are raise, fold and limp. Positions on the x-axis are listed in order of action at the table.
  • ...and 5 more figures
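
The decision matrices and action percentages described in the figure captions can be illustrated with a minimal Python sketch. It is not the paper's code: the function names (`hand_classes`, `combos`, `action_frequencies`) and the toy "raise every pocket pair, fold everything else" strategy are assumptions made purely for illustration. The sketch relies only on standard Hold'em combinatorics: 169 starting-hand classes covering 1326 two-card combinations (6 per pair, 4 per suited hand, 12 per offsuit hand).

```python
from collections import Counter

RANKS = "AKQJT98765432"  # high to low, as in the usual 13x13 hand matrix


def hand_classes():
    """Return the 169 starting-hand classes in matrix order.

    Diagonal cells are pairs ('AA'), cells above the diagonal are suited
    ('AKs'), and cells below it are offsuit ('AKo').
    """
    hands = []
    for i, row in enumerate(RANKS):
        for j, col in enumerate(RANKS):
            if i == j:
                hands.append(row + col)
            elif i < j:
                hands.append(row + col + "s")
            else:
                hands.append(col + row + "o")
    return hands


def combos(hand):
    """Number of concrete two-card combinations in a hand class."""
    if len(hand) == 2:
        return 6  # pocket pair
    return 4 if hand.endswith("s") else 12


def action_frequencies(decisions):
    """Map each action to its share of all combos in `decisions`."""
    totals = Counter()
    for hand, action in decisions.items():
        totals[action] += combos(hand)
    n = sum(totals.values())
    return {action: count / n for action, count in totals.items()}


# Hypothetical toy strategy (NOT taken from the paper):
# raise every pocket pair, fold everything else.
toy_decisions = {
    h: ("raise" if len(h) == 2 else "fold") for h in hand_classes()
}

if __name__ == "__main__":
    for action, freq in sorted(action_frequencies(toy_decisions).items()):
        print(f"{action}: {freq:.1%}")
```

Weighting by combinations rather than by matrix cells matters because an offsuit cell represents three times as many hands as a suited one. Computing such frequencies per position for a model's decisions gives the kind of summary plotted in Figure 5, and a very low raise share would correspond to the nit-like play attributed to ChatGPT.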