ORCHID: Fairness-Aware Orchestration in Mission-Critical Air-Ground Integrated Networks
Chuan-Chi Lai, Chi Jai Choy
TL;DR
ORCHID addresses non-stationarity and fairness gaps in multi-UAV orchestration for mission-critical AGINs by combining a GBS-aware initialization with a MAPPO-based Reset-and-Finetune (R&F) mechanism. The framework optimizes a multi-objective reward that balances coverage, energy efficiency, UAV workload, and UE rate fairness under explicit constraints, revealing a counter-intuitive efficiency-fairness synergy where Max-Min Fairness (MMF) can outperform Proportional Fairness (PF) in energy efficiency. Phase I provides a warm start via heterogeneous clustering that aligns UAV deployment with hotspots, while Phase II uses resilience-enhanced MARL to converge reliably to high-quality policies, particularly after triggering R&F at performance plateaus. Extensive simulations show ORCHID achieves superior Pareto dominance over baselines (e.g., MADDPG), delivering robust connectivity and enhanced energy efficiency in dynamic, disaster-like scenarios. The results advocate MMF-based fairness as a practical, performance-enhancing objective for rapid, equitable, and energy-conscious AGIN deployment in public safety operations.
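To make the two fairness notions concrete, the following minimal sketch uses the textbook definitions: MMF scores an allocation by its worst-served UE, while PF scores it by the sum of log-rates. The function names and the two rate vectors are hypothetical and chosen only for illustration; they are not taken from the paper's reward design or results.

```python
import math

def max_min_fairness(rates):
    """Textbook MMF utility: the rate of the worst-served UE."""
    return min(rates)

def proportional_fairness(rates):
    """Textbook PF utility: sum of log-rates (rates must be > 0)."""
    return sum(math.log(r) for r in rates)

# Hypothetical per-UE rates in Mbps (illustrative values only).
skewed = [12.0, 12.0, 1.0]   # two well-served UEs, one cell-edge UE
even = [4.0, 4.0, 3.0]       # lower total rate, no UE left far behind

candidates = {"skewed": skewed, "even": even}
mmf_choice = max(candidates, key=lambda k: max_min_fairness(candidates[k]))
pf_choice = max(candidates, key=lambda k: proportional_fairness(candidates[k]))
```

In this toy case the two criteria disagree: the PF score favours the skewed allocation because the log-sum rewards the high rates of the well-served UEs, whereas MMF favours the even allocation because it protects the cell-edge UE.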
Abstract
In the era of 6G Air-Ground Integrated Networks (AGINs), Unmanned Aerial Vehicles (UAVs) are pivotal for providing on-demand wireless coverage in mission-critical environments, such as post-disaster rescue operations. However, traditional Deep Reinforcement Learning (DRL) approaches for multi-UAV orchestration often face critical challenges: instability due to the non-stationarity of multi-agent environments and the difficulty of balancing energy efficiency with service equity. To address these issues, this paper proposes ORCHID (Orchestration of Resilient Coverage via Hybrid Intelligent Deployment), a novel stability-enhanced two-stage learning framework. First, ORCHID leverages a GBS-aware topology partitioning strategy to mitigate the exploration cold-start problem. Second, we introduce a Reset-and-Finetune (R&F) mechanism within the MAPPO architecture that stabilizes the learning process via synchronized learning rate decay and optimizer state resetting. This mechanism effectively suppresses gradient variance to prevent policy degradation, thereby ensuring algorithmic resilience in dynamic environments. Furthermore, we uncover a counter-intuitive efficiency-fairness synergy: contrary to the conventional trade-off, our results demonstrate that the proposed Max-Min Fairness (MMF) design not only guarantees service for cell-edge users but also achieves superior energy efficiency compared to Proportional Fairness (PF), which tends to converge to suboptimal greedy equilibria. Extensive experiments confirm that ORCHID occupies a superior Pareto-dominant position compared to state-of-the-art baselines, ensuring robust convergence and resilient connectivity in mission-critical scenarios.
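The R&F idea described above (decay the learning rate and reset optimizer state together once learning plateaus) can be sketched roughly as follows. This is a framework-free illustration under stated assumptions, not the authors' implementation: `AdamState` is a toy stand-in for a real optimizer, and the plateau rule (`patience`, `min_delta`) and the decay factor `lr_decay` are hypothetical parameters that the abstract does not specify.

```python
from dataclasses import dataclass, field

@dataclass
class AdamState:
    """Toy stand-in for an optimizer: learning rate, step count, moments."""
    lr: float
    step: int = 0
    m: dict = field(default_factory=dict)  # first-moment estimates
    v: dict = field(default_factory=dict)  # second-moment estimates

    def reset(self):
        """Discard accumulated moments and restart the step counter."""
        self.step = 0
        self.m.clear()
        self.v.clear()

class ResetAndFinetune:
    """Hypothetical plateau-triggered controller for a set of optimizers."""

    def __init__(self, optimizers, patience=5, min_delta=1e-3, lr_decay=0.5):
        self.optimizers = list(optimizers)
        self.patience = patience      # evaluations without gain before firing
        self.min_delta = min_delta    # minimum gain that counts as progress
        self.lr_decay = lr_decay      # multiplicative learning-rate decay
        self.best = float("-inf")
        self.stale = 0
        self.triggers = 0

    def update(self, episode_return):
        """Record one evaluation; return True if R&F fired on this call."""
        if episode_return > self.best + self.min_delta:
            self.best = episode_return
            self.stale = 0
            return False
        self.stale += 1
        if self.stale < self.patience:
            return False
        # Plateau detected: apply the same decay and reset to every
        # optimizer in one step so actor and critic stay in sync.
        for opt in self.optimizers:
            opt.lr *= self.lr_decay
            opt.reset()
        self.stale = 0
        self.triggers += 1
        return True

# Usage with made-up returns: one improvement, then a flat stretch.
actor = AdamState(lr=3e-4, step=100, m={"w": 0.1}, v={"w": 0.01})
critic = AdamState(lr=1e-3, step=100, m={"w": 0.2}, v={"w": 0.04})
controller = ResetAndFinetune([actor, critic], patience=3)
fired = [controller.update(r) for r in [1.0, 1.0, 1.0, 1.0]]
```

With a real deep-learning framework, the reset step would correspond to clearing or re-creating the optimizer's internal state rather than calling the toy `reset()` shown here.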
