Advances in Multi-agent Reinforcement Learning: Persistent Autonomy and Robot Learning Lab Report 2024
Reza Azadeh
TL;DR
The paper addresses core MARL challenges in cooperative tasks with constrained agents, highlighting non-stationarity, the curse of dimensionality, and exploration difficulties. It presents three contributions from the PeARL lab: (i) RA-VDN, a relational-network–augmented CTDE framework that alters team reward contributions to steer coordination without reward sharing, validated in Switch gridworlds and real-robot Turtlebot4 experiments; (ii) Mixed Q-Functionals (MQF), a value-based approach for continuous-action MARL that leverages Q-Functionals to enable parallel action evaluation and demonstrates superior convergence over policy-based baselines across six experiments; and (iii) relational-weight optimization for Multi-Agent Multi-Armed Bandits (MAMAB), which convexly optimizes graph edge weights to speed consensus on arm means, showing clear gains in large constrained teams. The results span simulated and real-world robotic platforms, such as MaMuJoCo-Ant and Turtlebot4, and show improvements in adaptation to malfunctions and in cooperative learning under continuous-action constraints. Collectively, these contributions advance persistent autonomy by enabling relationally principled coordination and efficient learning in complex, constrained multi-robot systems.
Abstract
Multi-Agent Reinforcement Learning (MARL) approaches have emerged as popular solutions to the general challenges of cooperation in multi-agent environments, where success in achieving shared or individual goals depends critically on coordination and collaboration between agents. However, existing cooperative MARL methods face several challenges intrinsic to multi-agent systems, such as the curse of dimensionality, non-stationarity, and the need for a global exploration strategy. Moreover, the presence of agents with constraints (e.g., limited battery life, restricted mobility) or distinct roles further exacerbates these challenges. This document provides an overview of recent advances in MARL made at the Persistent Autonomy and Robot Learning (PeARL) lab at the University of Massachusetts Lowell. We briefly discuss various research directions and present a selection of approaches proposed in our most recent publications. For each proposed approach, we also highlight potential future directions to further advance the field.
