Cooperative Open-ended Learning Framework for Zero-shot Coordination
Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan
TL;DR
We address zero-shot coordination in two-player cooperative games by reformulating the tasks as Graphic-Form Games and Preference Graphic-Form Games, and by introducing the Cooperative Open-ended Learning (COLE) framework to identify and overcome cooperative incompatibility. A practical instantiation, COLE_SV, combines a Graphic Shapley Value-based solver with a trainer that optimizes a joint objective balancing individual performance and cooperation over a cooperative-incompatibility distribution. We prove convergence to a local best-preferred strategy at a Q-sublinear rate when in-degree centrality is used, and we validate performance in Overcooked, where COLE_SV outperforms state-of-the-art baselines when paired with unseen partners. The work advances open-ended MARL through graph-theoretic objective design, enabling more robust zero-shot coordination and clarifying how cooperative incompatibility can be mitigated in cooperative AI.
Abstract
Zero-shot coordination, the ability to coordinate effectively with a wide range of unseen partners, remains a significant challenge in cooperative artificial intelligence (AI). Previous algorithms have attempted to address this challenge by optimizing fixed objectives within a population to improve strategy or behaviour diversity. However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, a failure mode known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in two-player cooperative games from the perspective of graph theory in order to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. Furthermore, an analysis of the algorithm's learning process shows that it can efficiently overcome cooperative incompatibility. Experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with partners of different skill levels. Our demo is available at https://sites.google.com/view/cole-2023.
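To make the graph-theoretic view of cooperative ability more concrete, the following is a minimal sketch and not the COLE algorithm itself. It assumes a hypothetical payoff table in which entry (i, j) is the mean return when strategy i is paired with strategy j, draws one "preference" edge from each strategy to the partner it scores best with, and counts incoming edges as a rough in-degree signal. The function names and all payoff numbers are invented for illustration.

```python
def preferred_partners(payoff):
    """For each strategy i, return the index j that maximizes payoff[i][j].

    Ties are broken in favour of the lowest index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in payoff]


def in_degree(preferred, n):
    """Count how many strategies point to each strategy as their preferred partner."""
    counts = [0] * n
    for j in preferred:
        counts[j] += 1
    return counts


# Hypothetical payoffs for a population of three strategies (illustrative only).
payoff = [
    [10, 9, 6],
    [9, 7, 5],
    [6, 5, 12],
]

preferred = preferred_partners(payoff)   # one outgoing edge per strategy
degrees = in_degree(preferred, len(payoff))

# The strategy that no one prefers as a partner is a candidate for being
# cooperatively incompatible under this simplified proxy.
least_preferred = min(range(len(degrees)), key=degrees.__getitem__)
print(preferred, degrees, least_preferred)
```

In this toy population, strategy 2 scores best only with itself and strategy 1 is preferred by no one, which is the kind of structure a graph-based objective could detect and then target during training. How the framework actually weights strategies (for example through the Graphic Shapley Value solver) is not reproduced here.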
