Approximate State Abstraction for Markov Games
Hiroki Ishibashi, Kenshi Abe, Atsushi Iwasaki
TL;DR
This work tackles state-space explosion in two-player zero-sum Markov games (TZMGs) by introducing approximate state abstraction based on minimax values, extending approximate state abstraction from single-agent MDPs to TZMGs. It defines an abstract TZMG via state aggregation with a weight function and derives bounds on the duality gap incurred when an equilibrium of the abstract game is played in the ground game. When states are aggregated by minimax values, the loss relative to a ground equilibrium is bounded by a quantity that depends only on the aggregation tolerance ε and the discount factor γ. For the minimax-based aggregation φ^{Q*}, the authors prove the bound GAP ≤ 12ε/(1−γ)^3. They validate the approach experimentally on Markov Soccer, demonstrating substantial state-space reduction and near-equilibrium behavior for modest ε, while highlighting limitations at larger ε due to potential deadlocks. The paper also outlines extensions to additional similarity criteria (Model, Boltzmann, Multinomial) with corresponding bounds, and discusses future work on larger-scale games and cross-domain transferability of abstractions.
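To make the bound and the aggregation idea concrete, the following is a minimal Python sketch rather than the authors' implementation. The function names `gap_bound` and `aggregate_by_q`, the dictionary layout of the Q-table, and the choice to group states by ε-wide bins are all assumptions made for illustration; the binning rule is one simple way to guarantee that aggregated states have minimax Q-values within ε of each other for every action pair.

```python
import math


def gap_bound(eps, gamma):
    """Evaluate the stated bound 12 * eps / (1 - gamma)**3."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    if eps < 0.0:
        raise ValueError("eps must be non-negative")
    return 12.0 * eps / (1.0 - gamma) ** 3


def aggregate_by_q(q_values, eps):
    """Map each ground state to an abstract-state index (hypothetical helper).

    q_values: dict state -> dict (a, b) -> minimax Q-value, where a and b are
              the two players' actions. This layout is an assumption.
    Two states share an abstract state when, for every action pair, their
    Q-values fall into the same bin of width eps, so they differ by less
    than eps on every action pair.
    """
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    clusters = {}  # bin signature -> abstract-state index
    phi = {}       # ground state -> abstract-state index
    for state in sorted(q_values):
        table = q_values[state]
        signature = tuple(
            (pair, math.floor(q / eps)) for pair, q in sorted(table.items())
        )
        phi[state] = clusters.setdefault(signature, len(clusters))
    return phi


# Toy Q-table with three states and two action pairs (made-up numbers).
toy_q = {
    "s0": {(0, 0): 0.10, (0, 1): 0.52},
    "s1": {(0, 0): 0.12, (0, 1): 0.58},
    "s2": {(0, 0): 0.90, (0, 1): 0.52},
}
phi = aggregate_by_q(toy_q, eps=0.2)
bound = gap_bound(eps=0.01, gamma=0.9)
```

In this toy table `s0` and `s1` are merged while `s2` stays separate. The cubic dependence on 1/(1−γ) means the bound grows quickly as γ approaches 1: with γ = 0.9, even ε = 0.01 gives a bound of about 120.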
Abstract
This paper introduces state abstraction for two-player zero-sum Markov games (TZMGs), where the payoffs for the two players are determined by the state representing the environment and their respective actions, with state transitions governed as in Markov decision processes. For example, in games like soccer, the value of an action changes with the state of play, so such games are naturally described as Markov games. In TZMGs, computing equilibria becomes more difficult as the number of states increases. We therefore consider state abstraction, which reduces the number of states by treating multiple different states as a single state. There is a substantial body of research on finding optimal policies for Markov decision processes using state abstraction. In the multi-player setting, however, the abstract game may yield equilibrium solutions that differ from those of the ground game. To evaluate the equilibrium solutions of the abstract game, we derive bounds on the duality gap, which measures the distance from the equilibrium solutions of the ground game. Finally, we demonstrate our state abstraction on Markov Soccer, compute equilibrium policies, and examine the results.
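As an illustration of the duality gap used as the evaluation measure, here is a minimal Python sketch for the single-state special case, that is, a zero-sum matrix game. This is a simplification and not the paper's definition for full TZMGs, where best responses are taken over policies of the whole Markov game. The function name `duality_gap` and the convention that the row player maximizes are assumptions.

```python
def duality_gap(payoff, x, y):
    """Duality gap of mixed strategies (x, y) in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff (the row player maximizes,
    the column player minimizes). x and y are probability vectors.
    The gap is (best row response value against y)
             - (best column response value against x),
    and it is zero exactly when (x, y) is an equilibrium.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Expected payoff of each pure row action against y.
    row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                  for i in range(n_rows)]
    # Expected payoff of each pure column action against x.
    col_values = [sum(payoff[i][j] * x[i] for i in range(n_rows))
                  for j in range(n_cols)]
    return max(row_values) - min(col_values)


# Matching pennies: the uniform strategy pair is the unique equilibrium.
pennies = [[1.0, -1.0], [-1.0, 1.0]]
gap_equilibrium = duality_gap(pennies, [0.5, 0.5], [0.5, 0.5])
gap_exploitable = duality_gap(pennies, [1.0, 0.0], [0.5, 0.5])
```

The uniform pair has gap 0, while a row player who always plays the first action can be exploited for a gap of 1.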
