Tree Search for Simultaneous Move Games via Equilibrium Approximation
Ryan Yu, Alex Olshevsky, Peter Chin
TL;DR
The paper tackles learning in simultaneous-move, partial-information multi-agent games by injecting game-theoretic equilibrium reasoning into tree search. It proposes NN-CCE, which uses per-agent policies and MA-EXP-IX to approximate an $\epsilon$-CCE within a structured, time-layered MCTS framework trained via self-play. Across OpenSpiel, Google Research Football, SMAC, and related benchmarks, NN-CCE outperforms equilibrium-approximation baselines and strong MARL methods, with improved consistency and robustness, albeit at the cost of longer training times. The work advances practical equilibrium-aware planning for discrete-action, multi-agent settings and suggests paths toward continuous-action extensions and broader scalability.
Abstract
Neural-network-supported tree search has shown strong results in a variety of perfect information multi-agent tasks. However, the performance of these methods on partial information games has generally been below that of competing approaches. Here we study simultaneous-move games, the subclass of partial information games that is most similar to perfect information games: both agents know the game state with the exception of the opponent's move, which is revealed only after each agent makes its own move. Simultaneous-move games include popular benchmarks such as Google Research Football and StarCraft. In this study we ask: can tree search algorithms trained through self-play in perfect information settings be adapted to simultaneous-move games without significant loss of performance? We answer this question by deriving a practical method that attempts to approximate a coarse correlated equilibrium as a subroutine within a tree search. Our algorithm works on cooperative, competitive, and mixed tasks. It outperforms the current best MARL algorithms on a wide range of accepted baseline environments.
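To make the equilibrium target concrete: under the standard definition, a joint distribution $\sigma$ over action profiles is an $\epsilon$-CCE if, for every player $i$ and every fixed deviation $a_i'$, $\mathbb{E}_{a \sim \sigma}[u_i(a_i', a_{-i})] \le \mathbb{E}_{a \sim \sigma}[u_i(a)] + \epsilon$. The sketch below only checks this condition on a small normal-form game; it is not the paper's NN-CCE or MA-EXP-IX procedure. The names `cce_gap`, `matching_pennies`, and `ACTIONS` are hypothetical and chosen for illustration.

```python
from itertools import product


def cce_gap(joint, actions, utility):
    """Smallest epsilon >= 0 for which `joint` is an epsilon-CCE.

    joint:   dict mapping joint-action tuples to probabilities (summing to 1).
    actions: list with one sequence of available actions per player.
    utility: callable (player_index, joint_action) -> payoff for that player.
    """
    worst = 0.0
    for i, own_actions in enumerate(actions):
        # Expected payoff of player i when everyone follows the recommendation.
        follow = sum(p * utility(i, a) for a, p in joint.items())
        for dev in own_actions:
            # Player i commits to `dev` without seeing the recommendation,
            # while the other players keep following the joint distribution.
            deviate = sum(
                p * utility(i, a[:i] + (dev,) + a[i + 1:])
                for a, p in joint.items()
            )
            worst = max(worst, deviate - follow)
    return worst


# Illustrative game (an assumption, not taken from the paper): matching pennies.
ACTIONS = [(0, 1), (0, 1)]


def matching_pennies(player, a):
    # Player 0 wins when the coins match; player 1 wins otherwise (zero-sum).
    match = 1 if a[0] == a[1] else -1
    return match if player == 0 else -match


uniform = {a: 0.25 for a in product(*ACTIONS)}
pure = {(0, 0): 1.0}

print(cce_gap(uniform, ACTIONS, matching_pennies))
print(cce_gap(pure, ACTIONS, matching_pennies))
```

In this toy game the uniform joint distribution is an exact CCE (gap 0), while the pure profile `(0, 0)` leaves player 1 a deviation worth 2, so it is only an $\epsilon$-CCE for $\epsilon \ge 2$.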
