Markov Potential Game Construction and Multi-Agent Reinforcement Learning with Applications to Autonomous Driving
Huiwen Yan, Mushuang Liu
TL;DR
The paper tackles the difficulty of reaching a Nash equilibrium (NE) in general-sum Markov games (MGs) by working with Markov potential games (MPGs), a class of MGs with guaranteed existence of pure NEs and guaranteed convergence of gradient play. It provides sufficient conditions on the reward design and on the Markov decision process (MDP) under which an MG is an MPG, and shows how gradient ascent on a total potential function can drive the agents to an NE. The methodology is applied to autonomous driving at intersections, where a carefully designed reward structure yields a potential function that supports safe and efficient multi-vehicle coordination. The reported results indicate that multi-agent reinforcement learning (MARL) with MPGs is more robust than single-agent RL while maintaining safety across diverse surrounding-vehicle policies. The work offers a practical, theoretically grounded approach to MARL for autonomous driving and suggests applicability to other multi-agent system (MAS) domains with similar structural properties.
Abstract
Markov games (MGs) provide a mathematical foundation for multi-agent reinforcement learning (MARL), enabling self-interested agents to learn their optimal policies while interacting with others in a shared environment. However, due to the complexities of an MG problem, seeking (Markov perfect) Nash equilibrium (NE) is often very challenging for a general-sum MG. Markov potential games (MPGs), which are a special class of MGs, have appealing properties such as guaranteed existence of pure NEs and guaranteed convergence of gradient play algorithms, thereby leading to desirable properties for many MARL algorithms in their NE-seeking processes. However, the question of how to construct MPGs has remained open. This paper provides sufficient conditions on the reward design and on the Markov decision process (MDP), under which an MG is an MPG. Numerical results on autonomous driving applications are reported.
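For reference, one common formalization of an MPG in the literature, which may differ in detail from the definition used in the paper, requires a single potential function to track every agent's unilateral change in value. The symbols $V_i^{\pi}(s)$ (agent $i$'s value at state $s$ under joint policy $\pi = (\pi_i, \pi_{-i})$) and $\Phi^{\pi}(s)$ (the potential) are notation assumed here for illustration and do not appear in the text above:

$$
V_i^{(\pi_i,\pi_{-i})}(s) - V_i^{(\pi_i',\pi_{-i})}(s) = \Phi^{(\pi_i,\pi_{-i})}(s) - \Phi^{(\pi_i',\pi_{-i})}(s) \quad \text{for all agents } i,\ \text{states } s,\ \text{and policies } \pi_i,\ \pi_i',\ \pi_{-i}.
$$

Under this condition, any joint policy at which no agent can increase $\Phi$ by a unilateral deviation is also one at which no agent can increase its own value, so it is an NE. This is why ascent on the potential is a natural NE-seeking procedure.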
