A General Upper Bound for the Runtime of a Coevolutionary Algorithm on Impartial Combinatorial Games
Alistair Benford, Per Kristian Lehre
TL;DR
This work provides the first runtime analysis of a coevolutionary estimation of distribution algorithm (UMDA) on impartial combinatorial games, introducing a game-graph invariant called switchability to characterize learning difficulty. The authors prove a general upper bound: with a suitably large population size $\mu$, the algorithm finds an optimal strategy within time $O(\mu\, r(G))$ with high probability, where $r(G)$ depends on the number of positions $n$, the maximum degree $\Delta$, and the per-vertex switchability $s(v)$ of the critical positions $W_G$. A corollary gives a simpler bound in terms of the maximum switchability, which yields polynomial or quasipolynomial runtimes for many games. The paper applies the bound to Subtraction Nim, Silver Dollar, Turning Turtles, and Chomp, illustrating how the framework supports a theory-driven understanding of CoEA behavior in turn-based impartial games and can guide future runtime analyses in more complex settings.
Abstract
Due to their complex dynamics, combinatorial games are a key test case and application for algorithms that train game-playing agents. Among those algorithms that train using self-play are coevolutionary algorithms (CoEAs). However, the successful application of CoEAs to game playing is difficult due to pathological behaviours such as cycling, an issue especially critical for games with intransitive payoff landscapes. Insight into how to design CoEAs to avoid such behaviours can be provided by runtime analysis. In this paper, we push the scope of runtime analysis for CoEAs to combinatorial games, proving a general upper bound for the number of simulated games needed for UMDA to discover (with high probability) an optimal strategy. This result applies to any impartial combinatorial game, and for many games the implied bound is polynomial or quasipolynomial as a function of the number of game positions. After proving the main result, we provide several applications to simple well-known games: Nim, Chomp, Silver Dollar, and Turning Turtles. As the first runtime analysis for CoEAs on combinatorial games, this result is a critical step towards a comprehensive theoretical framework for coevolution.
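
To make "optimal strategy" concrete, the following minimal sketch labels the positions of Subtraction Nim by backward induction on the game graph. It is not the UMDA analysed in the paper; it only illustrates the target such an algorithm has to discover. It assumes the normal-play convention (a player with no legal move loses) and a single heap from which a player removes a number of tokens in a fixed set. The names `losing_positions` and `optimal_move` are hypothetical and chosen for illustration.

```python
def losing_positions(n, moves):
    """Return the heap sizes 0..n from which the player to move loses under optimal play."""
    losing = [False] * (n + 1)
    for pos in range(n + 1):
        # A position is losing iff every legal move leads to a winning position.
        # A position with no legal move is losing (normal play, assumed here).
        losing[pos] = all(not losing[pos - m] for m in moves if m <= pos)
    return [pos for pos in range(n + 1) if losing[pos]]


def optimal_move(pos, moves, losing):
    """Return a move that leaves the opponent in a losing position, or None if none exists."""
    for m in sorted(moves):
        if m <= pos and (pos - m) in losing:
            return m
    return None


if __name__ == "__main__":
    moves = {1, 2, 3}
    lost = set(losing_positions(12, moves))
    print(sorted(lost))                  # losing heap sizes for the player to move
    print(optimal_move(7, moves, lost))  # a winning move from a heap of 7
```

With subtraction set {1, 2, 3}, the losing positions are exactly the multiples of 4, so an optimal strategy always moves to the nearest multiple of 4 when one is reachable. A strategy in this sense is a choice of move at every position, which is the kind of object a self-play algorithm has to get right at the positions that matter.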
