Scaling Opponent Shaping to High Dimensional Games
Akbir Khan, Timon Willi, Newton Kwan, Andrea Tacchetti, Chris Lu, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster
TL;DR
This paper tackles the challenge of scaling opponent shaping (OS) to high-dimensional, temporally-extended general-sum games. It introduces Shaper, a memory-efficient OS method that captures both context and history with a single recurrent agent and averages hidden states across the batch to align co-player updates. In extensive experiments on IPD in the Matrix, IMP in the Matrix, and the CoinGame, Shaper achieves better individual and collective outcomes than prior OS methods and Naive Learners, highlighting the importance of memory and batch averaging. The results establish OS as a scalable approach for complex multi-agent settings, while also revealing limitations of existing benchmarks such as the CoinGame and underscoring ethical considerations for shaping in real-world systems.
Abstract
In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes. To address this issue, opponent shaping (OS) methods explicitly learn to influence the learning dynamics of co-players and empirically lead to improved individual and collective outcomes. However, OS methods have only been evaluated in low-dimensional environments due to the challenges associated with estimating higher-order derivatives or scaling model-free meta-learning. Alternative methods that scale to more complex settings either converge to undesirable solutions or rely on unrealistic assumptions about the environment or co-players. In this paper, we successfully scale an OS-based approach to general-sum games with temporally-extended actions and long time horizons for the first time. After analysing the representations of the meta-state and history used by previous algorithms, we propose a simplified version called Shaper. We show empirically that Shaper leads to improved individual and collective outcomes in a range of challenging settings from the literature. We further formalize a technique previously implicit in the literature, and analyse its contribution to opponent shaping. We show empirically that this technique is helpful for the functioning of prior methods in certain environments. Lastly, we show that previous environments, such as the CoinGame, are inadequate for analysing temporally-extended general-sum interactions.
