Coordinated Strategies in Realistic Air Combat by Hierarchical Multi-Agent Reinforcement Learning
Ardian Selmonaj, Giacomo Del Rio, Adrian Schneider, Alessandro Antonucci
TL;DR
This work presents a hierarchical multi-agent reinforcement learning (HMARL) framework for realistic 3D air combat using JSBSim physics. By splitting decision-making into low-level continuous control policies and high-level commander policies, the approach enables coordinated maneuvers while handling partial observability. It couples curriculum learning with league-play to progressively increase task difficulty and robustness, and introduces a multi-agent adaptation of Simple Policy Optimization (MA-SPO) within a centralized-training, decentralized-execution paradigm. Experiments across varied team sizes show that hierarchical agents achieve superior combat performance and resilience compared with non-hierarchical baselines, including strong performance in large-scale engagements. The framework advances realism and coordination in air combat MARL and provides a foundation for sim-to-real transfer, human–AI collaboration, and further tactical planning research.
Abstract
Achieving mission objectives in a realistic simulation of aerial combat is highly challenging due to imperfect situational awareness and nonlinear flight dynamics. In this work, we introduce a novel 3D multi-agent air combat environment and a Hierarchical Multi-Agent Reinforcement Learning framework to tackle these challenges. Our approach combines heterogeneous agent dynamics, curriculum learning, league-play, and a newly adapted training algorithm. The decision-making process is organized into two levels of abstraction: low-level policies learn precise control maneuvers, while high-level policies issue tactical commands based on mission objectives. Empirical results show that our hierarchical approach improves both learning efficiency and combat performance in complex dogfight scenarios.
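
The two-level split described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the observation features, the command names ("attack", "evade"), the control outputs, and the hand-written rules standing in for learned policies are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    # Placeholder features; the real observation space is not given in the text.
    distance_to_opponent: float  # metres (assumed unit)
    opponent_behind: bool


def high_level_policy(obs: Observation) -> str:
    """Stand-in for a learned commander policy: returns a discrete tactical command."""
    return "evade" if obs.opponent_behind else "attack"


def low_level_policy(command: str, obs: Observation) -> dict:
    """Stand-in for a learned control policy: maps a command to continuous controls."""
    if command == "attack":
        # Throttle scales with distance, capped at full throttle (assumed rule).
        throttle = min(1.0, obs.distance_to_opponent / 5000.0)
        return {"aileron": 0.0, "elevator": 0.1, "throttle": throttle}
    if command == "evade":
        # Hard roll and pull at full throttle (assumed rule).
        return {"aileron": 1.0, "elevator": 0.5, "throttle": 1.0}
    raise ValueError(f"unknown command: {command}")


def act(obs: Observation) -> tuple:
    """One hierarchical decision: the commander picks a command, the controller executes it."""
    command = high_level_policy(obs)
    return command, low_level_policy(command, obs)
```

In this sketch the low-level policy never sees the mission objective directly; it only receives the command, which is the separation of concerns the hierarchy relies on. In a trained system both functions would be replaced by learned policies.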
