The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition
Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noboru Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina, Daniel Ionita
TL;DR
This paper addresses scalability and generalization in multi-agent reinforcement learning under heterogeneous opponents and task variations. It introduces the MARLÖ competition, a Minecraft/Malmo-based benchmark featuring three games (Mob Chase, Build Battle, and Treasure Hunt) with highly parameterizable tasks and a playoff-style evaluation. The contribution includes a public starter kit, baseline agents, and a unified evaluation protocol designed to facilitate cross-game generalization and fair comparisons. The setup aims to move multi-agent RL toward generalizable policies that apply across diverse environments and partner/opponent configurations.
Abstract
Learning in multi-agent scenarios is a fruitful research direction, but current approaches still scale poorly across multiple games with general reward settings and different opponent types. The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) competition is a new challenge that promotes research in this domain using multiple 3D games. The goal of the contest is to foster research on general agents that can learn across different games and opponent types, and it is proposed as a milestone on the path toward Artificial General Intelligence.
