Ludax: A GPU-Accelerated Domain Specific Language for Board Games

Graham Todd, Alexander G. Padula, Dennis J. N. J. Soemers, Julian Togelius

Abstract

Games have long been used as benchmarks and testing environments for research in artificial intelligence. A key step in supporting this research was the development of game description languages: frameworks that compile domain-specific code into playable and simulatable game environments, allowing researchers to generalize their algorithms and approaches across multiple games without having to manually implement each one. More recently, progress in reinforcement learning (RL) has been largely driven by advances in hardware acceleration. Libraries like JAX allow practitioners to take full advantage of cutting-edge computing hardware, often speeding up training and testing by orders of magnitude. Here, we present a synthesis of these strands of research: a domain-specific language for board games which automatically compiles into hardware-accelerated code. Our framework, Ludax, combines the generality of game description languages with the speed of modern parallel processing hardware and is designed to fit neatly into existing deep learning pipelines. We envision Ludax as a tool to help accelerate games research generally, from RL to cognitive science, by enabling rapid simulation and providing a flexible representation scheme. We present a detailed breakdown of Ludax's description language and technical notes on the compilation process, along with speed benchmarking and a demonstration of training RL agents. The Ludax framework, along with implementations of existing board games, is open-source and freely available.

Paper Structure

This paper contains 20 sections, 2 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: Natural language description of Reversi along with its corresponding translation into Ludax. Ludax uses "ludemic" syntax that represents high-level game components as separate program sections and aims to be easily interpretable to non-experts.
  • Figure 2: Average throughput (moves per second) on various exemplar games for Ludax, Ludii, and PGX. The first four games are implemented in all three frameworks, while the remaining games are implemented only in Ludax and Ludii. Speeds for Ludax and PGX are reported for 500 episodes of various batch sizes on a workstation with a single NVIDIA 4090 GPU and 36 threads on 18 CPU cores, while speeds for Ludii are the fastest recorded throughput for parallel execution on the same workstation across 1, 16, and 32 threads. Error bars are standard deviations over the 500 episodes.
  • Figure 3: Performance of reinforcement learning agents trained in the Ludax and PGX implementations of Reversi against the PGX baseline agent. On the left, we plot the average winrate of the learned agents against the baseline over time and across three separate runs. On the right, we plot the average and variance of the winrates. Each run took roughly 3 hours to complete on a workstation with a single A100 GPU.
  • Figure 4: Ludax syntax for Reversi, Connect Four (classic board games), Yavalath, and Wolf and Sheep. Ludax supports a wide range of games of varying complexity.
  • Figure 5: Ludax rendering of a simple gridworld environment akin to FrozenLake. Ludax's syntax can be adapted to represent single-player games with movement dynamics more typical of simple video games. Player 1 (white) moves a single piece (circle) one step at a time, attempting to avoid the "danger" region in orange and reach the "target" region in green.
  • ...and 1 more figure