Perfect Information Monte Carlo with Postponing Reasoning

Jérôme Arjonilla; Abdallah Saffidine; Tristan Cazenave

TL;DR

This work introduces Extended Perfect Information Monte Carlo (EPIMC), an online algorithm that postpones the perfect-information leaf evaluation to a depth $d$ in order to mitigate strategy fusion in imperfect-information games. It gives a formal treatment of strategy fusion, proves that increasing the depth never increases strategy fusion and can eliminate it in finite games, and reports significant empirical gains on domains with private information (e.g., Dark Chess, Dark Hex, Phantom Tic-Tac-Toe) where PIMC struggles. EPIMC remains online and is compatible with neural-network enhancements, subgame solvers, and general game playing, offering a practical route to improved planning under hidden information. The results show depth-aware determinization outperforming state-of-the-art online methods such as IS-MCTS on several benchmarks, and they lay the groundwork for future hybrids that combine subgame solving with learned leaf evaluations.

Abstract

Imperfect information games, such as Bridge and Skat, present challenges due to state-space explosion and hidden information, posing formidable obstacles for search algorithms. Determinization-based algorithms offer a resolution by sampling hidden information and solving the game in a perfect information setting, facilitating rapid and effective action estimation. However, transitioning to perfect information introduces challenges, notably one called strategy fusion. This research introduces 'Extended Perfect Information Monte Carlo' (EPIMC), an online algorithm inspired by the state-of-the-art determinization-based approach Perfect Information Monte Carlo (PIMC). EPIMC enhances the capabilities of PIMC by postponing the perfect information resolution, alleviating issues related to strategy fusion. However, the decision to postpone the leaf evaluator introduces novel considerations, such as the interplay between prior levels of reasoning and the newly deferred resolution. In our empirical analysis, we investigate the performance of EPIMC across a range of games, with a particular focus on those characterized by varying degrees of strategy fusion. Our results demonstrate notable performance enhancements, particularly in games where strategy fusion significantly impacts gameplay. Furthermore, our research contributes to the theoretical foundation of determinization-based algorithms addressing challenges associated with strategy fusion.
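To make strategy fusion concrete, the following minimal Python sketch uses a hypothetical single-player toy game that is not taken from the paper. All names (`WORLDS`, `payoff`, `pimc_values`, `postponed_values`) and the payoff numbers are assumptions chosen for illustration. At the root the player picks `safe` (payoff 0.6 in every world) or `guess`, after which the player must name the hidden world without observing it (+1 if correct, -1 otherwise). Plain determinization averages perfect-information values and so assumes a different guess can be made in each sampled world. Postponing the perfect-information evaluation by one level, here approximated by forcing a single guess per information set, removes that optimism. This is only a sketch of the idea; the actual EPIMC algorithm relies on a subgame solver and a leaf evaluator that are not reproduced here.

```python
from statistics import mean

# Hypothetical toy game (an assumption for illustration, not from the paper).
WORLDS = ["left", "right"]  # hidden state, uniform prior


def payoff(world, root_action, guess=None):
    """Payoff to the searching player in a fully specified play."""
    if root_action == "safe":
        return 0.6
    return 1.0 if guess == world else -1.0


def perfect_info_value(world, root_action):
    """Value once the world is revealed: the later guess may depend on it."""
    if root_action == "safe":
        return payoff(world, "safe")
    return max(payoff(world, "guess", g) for g in WORLDS)


def pimc_values():
    """PIMC-style estimate: average perfect-information values over worlds."""
    return {a: mean(perfect_info_value(w, a) for w in WORLDS)
            for a in ("safe", "guess")}


def postponed_values():
    """Keep the follow-up decision at the information-set level:
    one guess must be shared by every world the player cannot distinguish."""
    values = {"safe": mean(payoff(w, "safe") for w in WORLDS)}
    values["guess"] = max(
        mean(payoff(w, "guess", g) for w in WORLDS) for g in WORLDS
    )
    return values


def best(values):
    """Root action with the highest estimated value."""
    return max(values, key=values.get)


if __name__ == "__main__":
    print("PIMC-style:", pimc_values(), "->", best(pimc_values()))
    print("Postponed :", postponed_values(), "->", best(postponed_values()))
```

In this toy setting the averaged perfect-information estimate prefers `guess`, while the estimate that reasons over the information set for one more level prefers `safe`, which is the better choice when the world cannot be observed.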

Paper Structure (29 sections, 9 figures, 2 algorithms)

This paper contains 29 sections, 9 figures, 2 algorithms.

Introduction
Notation and Background
Notation
Determinization-based algorithm
Strategy fusion
Perfect Information Monte Carlo
Extended PIMC
Theoretical foundation
Results
Games
Card game
Battleship
Dark Chess
Phantom Tic-Tac-Toe
Dark Hex
...and 14 more sections

Figures (9)

Figure 1: Variant of 'Rock-Paper-Scissors'. The red/green square/diamond is the first/second player and the dashed line represents worlds indistinguishable by the second player.
Figure 2: Winning rate of EPIMC when the depth varies from 1 to 3. The opponent is PIMC with a budget of one second.
Figure 3: Winning rate of EPIMC when the subgame solver is CFR+ or ISS. The opponent is PIMC with a budget of one second.
Figure 4: Winning rate of EPIMC when the perfect information leaf evaluator is Minimax or Random Rollout. The opponent is PIMC with a budget of one second.
Figure 5: Winning rate of online algorithms. The opponent is PIMC with a budget of one second.
...and 4 more figures

Theorems & Definitions (4)

Definition 1
proof
proof
proof
