Monopoly Deal: A Benchmark Environment for Bounded One-Sided Response Games

Will Wolf

TL;DR

The paper formalizes Bounded One-Sided Response Games (BORGs) and introduces a modified Monopoly Deal as a compact benchmark for studying them. It shows that standard Monte Carlo CFR with a minimal intent-based abstraction converges on effective strategies in this setting, without novel algorithmic extensions. A lightweight full-stack platform unifies the environment, a parallel CFR runtime, and a human-playable web interface to support reproducible research. The result is a tractable, interpretable testbed for studying sequential decision-making under bounded interruptions and imperfect information.

Abstract

Card games are widely used to study sequential decision-making under uncertainty, with real-world analogues in negotiation, finance, and cybersecurity. These games typically fall into three categories based on the flow of control: strictly sequential (players alternate single actions), deterministic response (some actions trigger a fixed outcome), and unbounded reciprocal response (alternating counterplays are permitted). A less-explored but strategically rich structure is the bounded one-sided response, where a player's action briefly transfers control to the opponent, who must satisfy a fixed condition through one or more moves before the turn resolves. We term games featuring this mechanism Bounded One-Sided Response Games (BORGs). We introduce a modified version of Monopoly Deal as a benchmark environment that isolates this dynamic, where a Rent action forces the opponent to choose payment assets. The gold-standard algorithm, Counterfactual Regret Minimization (CFR), converges on effective strategies without novel algorithmic extensions. A lightweight full-stack research platform unifies the environment, a parallelized CFR runtime, and a human-playable web interface. The trained CFR agent and source code are available at https://monopolydeal.ai.
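
The bounded one-sided response described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's environment: the `Player` class, the integer asset values, the `resolve_rent` function, and the `pay_smallest_first` policy are all hypothetical names chosen for illustration, and the payment rule (pay until the debt is covered or no assets remain) is an assumption about how a Rent response might resolve.

```python
from dataclasses import dataclass, field


@dataclass
class Player:
    name: str
    # Hypothetical asset values; the real game distinguishes cash and properties.
    assets: list[int] = field(default_factory=list)


def resolve_rent(responder: Player, amount_due: int, choose) -> list[int]:
    """Run the response phase: only the responder acts, choosing assets one
    at a time until the debt is covered or nothing is left to pay with."""
    paid: list[int] = []
    # The loop is bounded by the responder's asset count, so control always
    # returns to the acting player after finitely many moves.
    while sum(paid) < amount_due and responder.assets:
        asset = choose(responder.assets, amount_due - sum(paid))
        responder.assets.remove(asset)
        paid.append(asset)
    return paid


def pay_smallest_first(assets: list[int], remaining: int) -> int:
    """Toy response policy (an assumption): surrender the lowest-value asset."""
    return min(assets)


responder = Player("opponent", assets=[1, 2, 5])
paid = resolve_rent(responder, amount_due=3, choose=pay_smallest_first)
print(paid, responder.assets)  # [1, 2] [5]
```

The response is one-sided because the acting player makes no move inside the loop, and bounded because each iteration removes one asset from a finite pool. The strategic content lies in the `choose` policy, which is the decision a learned agent would replace.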

Paper Structure

This paper contains 45 sections, 12 equations, 7 figures, and 1 table.

Figures (7)

  • Figure 1: System architecture: training and serving stacks. The training stack runs CFR self-play experiments and logs metrics and checkpoints. The trained policy is exported as JSON to the serving stack, where it is loaded into a FastAPI service backed by a database and accessed through a Next.js frontend.
  • Figure 2: Decline in maximum expected regret during training, demonstrating convergence of MCCFR under bounded one-sided response dynamics.
  • Figure 3: Win rate over time against baseline opponents (20 games played at each evaluation) during CFR training. Solid lines correspond to games where the agent always plays first; dashed lines indicate randomized starts. The model achieves near-perfect play against random opponents and competitive win rates against more sophisticated opponents.
  • Figure 4: Distribution of cumulative infoset update counts throughout training on a logarithmic scale. Typical information sets (50th percentile) are visited roughly 50 times during training, while the most common information sets (100th percentile) are visited roughly 2,000 times.
  • Figure 5: Median probability of abstract actions available to the target player throughout training. Actions that promote property building and retention are favored.
  • ...and 2 more figures
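
The regret quantities referenced in the Figure 2 and Figure 4 captions come from CFR's per-infoset update. As a rough illustration, the sketch below shows regret matching, the standard rule by which CFR-family algorithms turn cumulative regrets into a strategy at a single information set. The function name and the example numbers are illustrative assumptions, not taken from the paper's code.

```python
def regret_matching(cumulative_regrets: list[float]) -> list[float]:
    """Return action probabilities proportional to positive cumulative regret.

    When no action has positive regret, fall back to a uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical regrets for three abstract actions at one information set.
strategy = regret_matching([2.0, -1.0, 6.0])
print(strategy)  # [0.25, 0.0, 0.75]
```

Actions with negative cumulative regret receive zero probability, so the strategy shifts toward actions that would have performed better in hindsight.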