Enhancements for Real-Time Monte-Carlo Tree Search in General Video Game Playing

Dennis J. N. J. Soemers; Chiara F. Sironi; Torsten Schuster; Mark H. M. Winands

TL;DR

This work enhances Monte-Carlo Tree Search (MCTS) for General Video Game Playing (GVGP), where agents must play diverse real-time games that are unknown in advance. It studies eight enhancements: Progressive History, N-Gram Selection Technique, Tree Reuse, Breadth-First Tree Initialization, Loss Avoidance, Novelty-Based Pruning, Knowledge-Based Evaluations, and Deterministic Game Detection. Most enhancements yield statistically significant increases in win percentage when applied individually. Combined, they raise the average win percentage over sixty games from 31.0% to 48.4% relative to a vanilla MCTS implementation, approaching the level of the best agents of the 2015 GVG-AI competition. The study highlights practical gains for real-time general game-playing agents and outlines directions for parameter tuning and domain transfer.

Abstract

General Video Game Playing (GVGP) is a field of Artificial Intelligence where agents play a variety of real-time video games that are unknown in advance. This limits the use of domain-specific heuristics. Monte-Carlo Tree Search (MCTS) is a search technique for game playing that does not rely on domain-specific knowledge. This paper discusses eight enhancements for MCTS in GVGP: Progressive History, N-Gram Selection Technique, Tree Reuse, Breadth-First Tree Initialization, Loss Avoidance, Novelty-Based Pruning, Knowledge-Based Evaluations, and Deterministic Game Detection. Some of these are known from existing literature and are either extended or introduced in the context of GVGP, while others are novel enhancements for MCTS. Most enhancements are shown to provide statistically significant increases in win percentages when applied individually. When combined, they increase the average win percentage over sixty different games from 31.0% to 48.4% in comparison to a vanilla MCTS implementation, approaching a level that is competitive with the best agents of the GVG-AI competition in 2015.
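As a point of reference for the "vanilla MCTS" baseline mentioned above, the selection step of MCTS is commonly implemented with the UCB1 rule. The sketch below is a minimal illustration under that assumption. This summary does not state which selection policy or exploration constant the paper uses, so the function names, the tuple layout of `children`, and the default constant `c` are all hypothetical.

```python
import math


def ucb1(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child node (assumed selection rule, not confirmed by the source).

    Unvisited children get +inf so that each is tried at least once.
    """
    if child_visits == 0:
        return math.inf
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + exploration


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the action whose child maximizes UCB1.

    `children` is a list of (action, total_reward, visits) tuples;
    this layout is a hypothetical choice made for the example.
    """
    def score(child):
        _, total_reward, visits = child
        mean = total_reward / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits, c)

    return max(children, key=score)[0]


if __name__ == "__main__":
    children = [("left", 3.0, 5), ("right", 1.0, 5)]
    print(select_child(children, parent_visits=10))
```

With equal visit counts the exploration term is identical for both children, so the child with the higher mean reward is chosen. Enhancements such as Progressive History modify this selection value, but their exact formulas are not given in this summary and are therefore not sketched here.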

Paper Structure (15 sections, 5 equations, 7 figures, 6 tables)

Introduction
GVG-AI Framework and Competition
Monte-Carlo Tree Search
MCTS Enhancements for GVGP
Progressive History and N-Gram Selection Technique
Tree Reuse
Breadth-First Tree Initialization and Safety Prepruning
Loss Avoidance
Novelty-Based Pruning
Knowledge-Based Evaluations
Deterministic Game Detection
Experiments
Setup
Results
Conclusion and Future Work

Figures (7)

Figure 1: Example open-loop game tree. Nodes other than the root node can represent multiple possible states in nondeterministic games.
Figure 2: The four steps of an MCTS simulation. Adapted from Chaslot et al. (2008).
Figure 3: Tree Reuse in MCTS.
Figure 4: Example search tree. Dark nodes represent losing game states, and white nodes represent winning or neutral game states.
Figure 5: Example MCTS simulation with Loss Avoidance. The $X$ values in the last three nodes are evaluations of game states in those nodes. The dark node is a losing node.
...and 2 more figures
