People use fast, goal-directed simulation to reason about novel games

Cedegao E. Zhang; Katherine M. Collins; Lionel Wong; Mauricio Barba; Adrian Weller; Joshua B. Tenenbaum

TL;DR

This paper addresses how people rapidly evaluate novel multi-agent problems by proposing an intuitive game theory framework that uses fast, bounded, goal-directed simulations and sample-based inference to predict game outcomes and enjoyment. The core method combines a one-step lookahead agent with a simple, general value function and partial game simulations to estimate probabilities for outcomes such as win, loss, or draw. Empirical results show that this approach closely tracks human judgments (R^2 ≈ 0.86) across 121 novel Connect-N style games, outperforming deeper search baselines and naive alternatives, while also linking fun judgments to measured fairness, challenge, and length. The work suggests a resource-rational mechanism by which people reason under uncertainty and highlights potential neurosymbolic extensions with language-grounded planning to broaden applicability.
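The simulation idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an M-in-a-row game on an N by N board with free placement, a hypothetical value function (`value`) that counts pieces in unblocked lines, and the convention that a simulation cut off by `max_moves` is scored as a draw. All function names are illustrative.

```python
import random
from collections import Counter

def lines(n, m):
    """All runs of m consecutive cells (rows, columns, diagonals) on an n x n board."""
    out = []
    for r in range(n):
        for c in range(n):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(m)]
                if all(0 <= x < n and 0 <= y < n for x, y in cells):
                    out.append(cells)
    return out

def wins(board, player, all_lines):
    """True if `player` fully occupies at least one line."""
    return any(all(board.get(c) == player for c in line) for line in all_lines)

def value(board, player, all_lines):
    """Hypothetical heuristic: own best unblocked line minus the opponent's."""
    def best(p):
        scores = [sum(board.get(c) == p for c in line)
                  for line in all_lines
                  if all(board.get(c) != 1 - p for c in line)]
        return max(scores, default=0)
    return best(player) - best(1 - player)

def choose_move(board, player, empty, all_lines, rng):
    """One-step lookahead: score each legal move, break ties at random."""
    def score(cell):
        nxt = {**board, cell: player}
        if wins(nxt, player, all_lines):
            return float("inf")
        return value(nxt, player, all_lines)
    scored = [(score(c), c) for c in empty]
    top = max(s for s, _ in scored)
    return rng.choice([c for s, c in scored if s == top])

def simulate(n, m, max_moves, all_lines, rng):
    """Play one bounded game between two copies of the lookahead agent."""
    board = {}
    empty = [(r, c) for r in range(n) for c in range(n)]
    for t in range(min(max_moves, n * n)):
        player = t % 2  # player 0 moves first
        cell = choose_move(board, player, empty, all_lines, rng)
        board[cell] = player
        empty.remove(cell)
        if wins(board, player, all_lines):
            return "first_wins" if player == 0 else "second_wins"
    return "draw"  # assumption: a full board or a truncated game counts as a draw

def estimate_outcomes(n, m, num_sims=20, max_moves=None, seed=0):
    """Sample-based estimate of outcome probabilities from a few simulations."""
    rng = random.Random(seed)
    all_lines = lines(n, m)
    cap = n * n if max_moves is None else max_moves
    counts = Counter(simulate(n, m, cap, all_lines, rng) for _ in range(num_sims))
    return {k: counts[k] / num_sims for k in ("first_wins", "second_wins", "draw")}
```

Calling `estimate_outcomes(3, 3)` gives a rough outcome distribution for standard Tic-Tac-Toe under these assumptions. Lowering `num_sims` or `max_moves` makes the estimate cheaper and noisier, which is the kind of resource bound the summary describes.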

Abstract

We can evaluate features of problems and their potential solutions well before we can effectively solve them. When considering a game we have never played, for instance, we might infer whether it is likely to be challenging, fair, or fun simply from hearing the game rules, prior to deciding whether to invest time in learning the game or trying to play it well. Many studies of game play have focused on optimality and expertise, characterizing how people and computational models play based on moderate to extensive search and after playing a game dozens (if not thousands or millions) of times. Here, we study how people reason about a range of simple but novel Connect-N style board games. We ask people to judge how fair and how fun the games are from very little experience: just thinking about the game for a minute or so, before they have ever actually played with anyone else. We propose a resource-limited model that captures their judgments using only a small number of partial game simulations and almost no look-ahead search.

Paper Structure (10 sections, 3 equations, 5 figures, 1 table, 9 algorithms)

Introduction
Intuitive Game Theory Computational Model
Game specifications and game reasoning queries
Estimating game outcomes by simulating goal-directed but search-limited players
Human and Model Experiments
Human game evaluations and game construction
Model game evaluations and alternative models
Results and Discussion
Conclusion and future directions
Acknowledgments

Figures (5)

Figure 1: (A) Design of 121 grid games varying the game environment, dynamics, and win conditions. (B) Our intuitive game theory model simulates bounded game play under a fast but general agent model to draw probabilistic inferences about novel games.
Figure 2: (A) Qualitative analysis of voluntary participant usage of an interactive grid "scratchpad" suggests that many participants spontaneously seem to simulate game play against imaginary opponents to reason about novel games, though these simulations are often far from optimal: the upper game could be provably drawn by first playing in the center and then mirroring opponent play; the lower game shows an inefficient win strategy. (B) Human game evaluations of game outcomes, including the probability that the game ends in a draw, that the first player wins, and the expected payoff; and a rating of how fun a game is, on the simple class of Tic-Tac-Toe extensions (winning with M in a row on an N by N board). (C) People can generate their own novel game variants that they generally perceive to be reasonably fun, and rate fun games as fair ones (with an expected balanced payoff of 0, rather than a biased one).
Figure 3: Expected payoff across models against human-predicted payoff. Each point represents the payoff for one of the $n=121$ game stimuli. Error bars on the y-axis are over individual human judgments per game and depict standard error. Values in parentheses after each $R^2$ indicate 95% CI.
Figure 4: Absolute difference between human- and model-predicted payoff, broken down by game category. Error bars depict 95% CI over games within each game category (the number of games in each category is given in Table 1).
Figure 5: Human game fun ratings correlate well with several distinct game features: entropy over game outcomes as predicted by participants themselves; outcome entropy predicted under our model; predicted advantage over a random agent given our model's game play; expected game length under our model; and LLM-based estimates given the game specification. Far right plots show correlations in a combined regression model fit to all features predicted under our model (entropy, advantage, and length); and additionally incorporating the LLM-based estimates. Error bars depict standard error. Numbers in parentheses indicate 95% CI.
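Two quantities named in the captions above, expected payoff and entropy over game outcomes, can be computed from an outcome distribution as in the sketch below. The payoff convention (+1 for a first-player win, -1 for a second-player win, 0 for a draw) is an assumption chosen so that a balanced game has payoff 0. The function names are hypothetical.

```python
import math

def expected_payoff(p_first, p_second, p_draw):
    """Expected payoff to the first player under the assumed +1 / -1 / 0 convention."""
    return 1.0 * p_first - 1.0 * p_second + 0.0 * p_draw

def outcome_entropy(probs):
    """Shannon entropy in bits over outcome probabilities; zero-probability outcomes are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)
```

Under this convention a game whose outcomes are evenly split between the two players has payoff 0, and a game with a single certain outcome has entropy 0.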
