MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning

Florian Felten, Umut Ucak, Hicham Azmani, Gao Peng, Willem Röpke, Hendrik Baier, Patrick Mannion, Diederik M. Roijers, Jordan K. Terry, El-Ghazali Talbi, Grégoire Danoy, Ann Nowé, Roxana Rădulescu

TL;DR

This paper introduces MOMAland, the first collection of standardised environments for multi-objective multi-agent reinforcement learning (MOMARL). It offers over 10 diverse environments that vary in the number of agents, state representations, reward structures, and utility considerations.

Abstract

Many challenging tasks such as managing traffic systems, electricity grids, or supply chains involve complex decision-making processes that must balance multiple conflicting objectives and coordinate the actions of various independent decision-makers (DMs). One perspective for formalising and addressing such tasks is multi-objective multi-agent reinforcement learning (MOMARL). MOMARL broadens reinforcement learning (RL) to problems with multiple agents each needing to consider multiple objectives in their learning process. In reinforcement learning research, benchmarks are crucial in facilitating progress, evaluation, and reproducibility. The significance of benchmarks is underscored by the existence of numerous benchmark frameworks developed for various RL paradigms, including single-agent RL (e.g., Gymnasium), multi-agent RL (e.g., PettingZoo), and single-agent multi-objective RL (e.g., MO-Gymnasium). To support the advancement of the MOMARL field, we introduce MOMAland, the first collection of standardised environments for multi-objective multi-agent reinforcement learning. MOMAland addresses the need for comprehensive benchmarking in this emerging field, offering over 10 diverse environments that vary in the number of agents, state representations, reward structures, and utility considerations. To provide strong baselines for future research, MOMAland also includes algorithms capable of learning policies in such settings.

Paper Structure (54 sections, 21 equations, 16 figures, 5 tables, 1 algorithm)

Figures (16)

  • Figure 1: Overview of the libraries related to MOMAland within the Farama Foundation.
  • Figure 2: Multi-objective multi-agent decision-making models characterised along three axes: (i) observability; (ii) cooperativeness; (iii) statefulness (Rădulescu et al., 2020).
  • Figure 3: Pareto Front and resulting trade-offs learned on a CrazyRL environment (introduced below).
  • Figure 4: Visualization of some environments in MOMAland. From left to right: MO-Connect4, CrazyRL/Surround, MO-MultiWalker-Stability, MO-ItemGathering.
  • Figure 5: Average and 95% confidence intervals of multi-objective performance indicators on training results from MOMAPPO with 20 uniform weights on mo-multiwalker-stability-v0. The Pareto Front plot has been extracted from the run with the largest hypervolume.
  • ...and 11 more figures

Theorems & Definitions (7)

  • Definition 1: Multi-objective partially observable stochastic game
  • Definition 2: Pareto dominance
  • Definition 3: Pareto dominance for team reward
  • Definition 4: Pareto set for team reward
  • Definition 5: Pareto-Nash dominance
  • Definition 6: Pareto-Nash set
  • Definition 7: Nash equilibrium
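
As a rough illustration in the spirit of Definitions 2 and 4, the Python sketch below checks Pareto dominance between two value vectors and filters a set of vectors down to its non-dominated points. It uses the standard textbook definition and assumes every objective is maximised. The function names `pareto_dominates` and `pareto_front` and the sample vectors are hypothetical; they are not taken from the paper or from MOMAland's code.

```python
from typing import List, Sequence, Tuple


def pareto_dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u Pareto-dominates v (assumption: all objectives are maximised)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of objectives")
    # u must be at least as good as v in every objective ...
    at_least_as_good = all(a >= b for a, b in zip(u, v))
    # ... and strictly better in at least one.
    strictly_better = any(a > b for a, b in zip(u, v))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(pareto_dominates(q, p) for q in points)]


# Hypothetical two-objective value vectors, for illustration only.
returns = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0)]
front = pareto_front(returns)
print(front)
```

Here `(1.5, 3.0)` is dropped because `(2.0, 4.0)` is better in both objectives, while the remaining three vectors represent different trade-offs and none dominates another. The pairwise filter is quadratic in the number of points, which is fine for small sets but would need a more efficient method for large ones.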