A Framework for Scalable Heterogeneous Multi-Agent Adversarial Reinforcement Learning in IsaacLab
Isaac Peterson, Christopher Allred, Jacob Morrey, Mario Harper
TL;DR
The paper addresses the need for scalable adversarial multi-agent reinforcement learning in high-fidelity robotic simulations with heterogeneous morphologies. It introduces HARL-A, a framework that extends IsaacLab with per-team critics and a HAPPO-based training loop so that value signals remain meaningful in competitive, zero-sum settings, together with a curriculum-learning strategy and zero-buffer observation padding. In environments such as Sumo, Soccer, and 3D Galaga, the authors demonstrate emergent adversarial behaviors, improved win rates, and robust policy learning under both alternating and simultaneous training. The result is a practical, extensible platform for morphology-diverse adversarial MARL in embodied robotics, with potential applications in pursuit-evasion, security, and competitive manipulation. The paper also outlines concrete future enhancements for scalability and evaluation.
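To make the padding and per-team critic ideas concrete, here is a minimal sketch of one plausible reading: each agent's observation vector is right-padded with zeros to a common width, and each team's critic receives only its own team's padded observations. The agent names, the helper functions `pad_observations` and `team_critic_input`, and the concatenation order are hypothetical illustrations, not the framework's actual API.

```python
from typing import Dict, List, Optional


def pad_observations(obs: Dict[str, List[float]],
                     width: Optional[int] = None) -> Dict[str, List[float]]:
    """Right-pad every agent's observation with zeros to a common width.

    Assumption: heterogeneous agents emit flat vectors of different lengths.
    """
    target = width if width is not None else max(len(o) for o in obs.values())
    padded = {}
    for agent, o in obs.items():
        if len(o) > target:
            raise ValueError(f"observation for {agent!r} exceeds width {target}")
        padded[agent] = list(o) + [0.0] * (target - len(o))
    return padded


def team_critic_input(padded: Dict[str, List[float]],
                      team: List[str]) -> List[float]:
    """Concatenate the padded observations of one team in a fixed agent order."""
    flat: List[float] = []
    for agent in team:
        flat.extend(padded[agent])
    return flat


# Hypothetical two-team setup with different observation sizes.
raw = {
    "walker_0": [0.1, 0.2, 0.3],      # team A
    "rover_0": [1.0],                 # team A
    "drone_0": [5.0, 6.0],            # team B
}
padded = pad_observations(raw)
team_a_input = team_critic_input(padded, ["walker_0", "rover_0"])
team_b_input = team_critic_input(padded, ["drone_0"])
```

Keeping a separate input (and hence a separate value estimate) per team avoids a single shared critic averaging over opposing returns, which in a zero-sum game would tend toward an uninformative signal.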
Abstract
Multi-Agent Reinforcement Learning (MARL) is central to robotic systems cooperating in dynamic environments. While prior work has focused on these collaborative settings, adversarial interactions are equally critical for real-world applications such as pursuit-evasion, security, and competitive manipulation. In this work, we extend the IsaacLab framework to support scalable training of adversarial policies in high-fidelity physics simulations. We introduce a suite of adversarial MARL environments featuring heterogeneous agents with asymmetric goals and capabilities. Our platform integrates a competitive variant of Heterogeneous-Agent Proximal Policy Optimization (HAPPO), enabling efficient training and evaluation under adversarial dynamics. Experiments across several benchmark scenarios demonstrate the framework's ability to model and train robust policies for morphologically diverse multi-agent competition while maintaining high throughput and simulation realism. Code and benchmarks are available at: https://github.com/DIRECTLab/IsaacLab-HARL.
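For readers unfamiliar with HAPPO, the following sketch shows the core of the standard (cooperative) algorithm on a single sample: agents are updated sequentially, and each agent's advantage is weighted by the product of the probability ratios of the agents updated before it, then passed through a PPO-style clipped surrogate. This illustrates the published HAPPO scheme only; how the competitive variant described here modifies it is not specified in this section, and the function names below are hypothetical.

```python
from typing import List


def sequential_weights(advantage: float,
                       ratios_in_update_order: List[float]) -> List[float]:
    """Weighted advantage seen by each agent in the sequential update.

    The first agent sees the raw joint advantage; each later agent sees it
    multiplied by the new/old policy ratios of all previously updated agents.
    """
    weights = []
    running = advantage
    for ratio in ratios_in_update_order:
        weights.append(running)
        running *= ratio
    return weights


def clipped_surrogate(ratio: float, weighted_adv: float, eps: float = 0.2) -> float:
    """PPO-style clipped objective for one agent on one sample."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * weighted_adv, clipped_ratio * weighted_adv)


# Hypothetical single-sample example: joint advantage 2.0, three agents
# updated in order with new/old probability ratios 1.5, 0.5, and 1.0.
ratios = [1.5, 0.5, 1.0]
weights = sequential_weights(2.0, ratios)
objectives = [clipped_surrogate(r, w) for r, w in zip(ratios, weights)]
```

In practice these quantities are batched tensors and the update order is typically reshuffled each iteration; the scalar form is used here only to keep the arithmetic easy to follow.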
