ORCHID: Fairness-Aware Orchestration in Mission-Critical Air-Ground Integrated Networks

Chuan-Chi Lai, Chi Jai Choy

TL;DR

ORCHID addresses non-stationarity and fairness gaps in multi-UAV orchestration for mission-critical AGINs by combining a GBS-aware initialization with a MAPPO-based Reset-and-Finetune (R&F) mechanism. The framework optimizes a multi-objective reward that balances coverage, energy efficiency, UAV workload, and UE rate fairness under explicit constraints, revealing a counter-intuitive efficiency-fairness synergy where Max-Min Fairness (MMF) can outperform Proportional Fairness (PF) in energy efficiency. Phase I provides a warm start via heterogeneous clustering that aligns UAV deployment with user hotspots, while Phase II uses resilience-enhanced MARL to converge reliably to high-quality policies, particularly after R&F is triggered at performance plateaus. Extensive simulations show that ORCHID Pareto-dominates baselines (e.g., MADDPG), delivering robust connectivity and enhanced energy efficiency in dynamic, disaster-like scenarios. The results advocate MMF-based fairness as a practical, performance-enhancing objective for rapid, equitable, and energy-conscious AGIN deployment in public safety operations.
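The Phase I warm start can be pictured with a small sketch. It assumes a K-means-style procedure in which one centroid is pinned at the GBS position and the remaining N centroids, one per UAV, are free to move toward user hotspots. The summary does not specify the clustering algorithm, so the function name `gbs_aware_clustering`, its parameters, and the random seeding are illustrative assumptions rather than the authors' method.

```python
import math
import random


def gbs_aware_clustering(users, gbs_pos, n_uavs, iters=50, seed=0):
    """Hypothetical sketch: split users into n_uavs + 1 clusters.

    Cluster 0 belongs to the fixed GBS; clusters 1..n_uavs give
    candidate initial positions for the UAVs.
    """
    rng = random.Random(seed)
    users = [tuple(u) for u in users]
    # Seed the UAV centroids from randomly chosen user locations (assumption).
    centroids = [tuple(gbs_pos)] + rng.sample(users, n_uavs)

    def assign():
        # Each user joins its nearest centroid, GBS or UAV.
        return [
            min(range(n_uavs + 1), key=lambda k: math.dist(u, centroids[k]))
            for u in users
        ]

    for _ in range(iters):
        labels = assign()
        # Only UAV centroids are updated; the GBS centroid never moves.
        for k in range(1, n_uavs + 1):
            members = [u for u, lab in zip(users, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assign(), centroids


# Toy layout: a few users near the GBS and two distant hotspots.
demo_users = [(1, 0), (0, 1), (100, 100), (101, 100), (100, 101),
              (-100, 100), (-101, 100), (-100, 101)]
demo_labels, demo_centroids = gbs_aware_clustering(demo_users, (0.0, 0.0), n_uavs=2)
```

Like ordinary K-means, this sketch is sensitive to seeding, so the resulting partition depends on which users are drawn as initial UAV centroids.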

Abstract

In the era of 6G Air-Ground Integrated Networks (AGINs), Unmanned Aerial Vehicles (UAVs) are pivotal for providing on-demand wireless coverage in mission-critical environments, such as post-disaster rescue operations. However, traditional Deep Reinforcement Learning (DRL) approaches for multi-UAV orchestration often face critical challenges: instability due to the non-stationarity of multi-agent environments and the difficulty of balancing energy efficiency with service equity. To address these issues, this paper proposes ORCHID (Orchestration of Resilient Coverage via Hybrid Intelligent Deployment), a novel stability-enhanced two-stage learning framework. First, ORCHID leverages a GBS-aware topology partitioning strategy to mitigate the exploration cold-start problem. Second, we introduce a Reset-and-Finetune (R&F) mechanism within the MAPPO architecture that stabilizes the learning process via synchronized learning rate decay and optimizer state resetting. This mechanism effectively suppresses gradient variance to prevent policy degradation, thereby ensuring algorithmic resilience in dynamic environments. Furthermore, we uncover a counter-intuitive efficiency-fairness synergy: contrary to the conventional trade-off, our results demonstrate that the proposed Max-Min Fairness (MMF) design not only guarantees service for cell-edge users but also achieves superior energy efficiency compared to Proportional Fairness (PF), which tends to converge to suboptimal greedy equilibria. Extensive experiments confirm that ORCHID occupies a superior Pareto-dominant position compared to state-of-the-art baselines, ensuring robust convergence and resilient connectivity in mission-critical scenarios.
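To make the fairness notions in the abstract concrete, the following sketch uses the textbook definitions: max-min fairness scores an allocation by its worst user rate, proportional fairness by the sum of log rates, and Jain's Fairness Index is $(\sum_i r_i)^2 / (n \sum_i r_i^2)$. The paper's actual reward terms are not given in this summary, so the function names and the toy rate vectors below are illustrative assumptions.

```python
import math


def jain_index(rates):
    """Jain's Fairness Index: 1/n for one-user-takes-all, 1 for equal rates."""
    n = len(rates)
    total = sum(rates)
    sq = sum(r * r for r in rates)
    return (total * total) / (n * sq) if sq > 0 else 0.0


def max_min_utility(rates):
    """Max-min fairness objective: the rate of the worst-served user."""
    return min(rates)


def proportional_fair_utility(rates, eps=1e-9):
    """Proportional fairness objective: sum of log rates (eps avoids log 0)."""
    return sum(math.log(r + eps) for r in rates)


# Two toy allocations (made-up numbers, not results from the paper).
equal_rates = [1.0, 1.0, 1.0]    # every user served equally
skewed_rates = [0.5, 4.0, 4.0]   # higher total, but one user is starved

for name, rates in [("equal", equal_rates), ("skewed", skewed_rates)]:
    print(name, jain_index(rates), max_min_utility(rates),
          proportional_fair_utility(rates))
```

In this toy case the two objectives disagree: PF scores the skewed allocation higher because the gains of the two well-served users outweigh the loss of the third, while MMF prefers the equal allocation because it looks only at the worst-served user.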

Paper Structure

This paper contains 50 sections, 37 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: System model of the hybrid terrestrial-aerial wireless network with multiple UAVs assisting a macro GBS to provide on-demand coverage for ground users.
  • Figure 2: The overall architecture of the proposed ORCHID framework. The process is divided into two sequential stages: Phase I (Initialization) and Phase II (Resilient Fine-Tuning). In Phase I, a GBS-aware heterogeneous clustering strategy partitions user locations $\mathcal{X}$ into $N+1$ clusters to define the service scopes for the fixed GBS and $N$ mobile UAVs, providing initial positions for UAV deployment. In Phase II, fairness-aware MARL optimization is performed, in which a MAPPO-based agent dynamically coordinates resource allocation and UAV trajectories. To ensure robustness, a Reset-and-Finetune mechanism monitors Jain's Fairness Index (JFI) and triggers an optimizer reset (clearing $\mathbf{m}, \mathbf{v}$) and a learning rate decay (by factor $\kappa$) whenever a performance plateau is detected.
  • Figure 3: ORCHID: Two-Stage Orchestration Framework
  • Figure 4: Training convergence analysis of the ORCHID framework over 700 episodes. Solid lines and shaded areas denote the mean and $\pm 1$ standard deviation over 5 independent runs. The vertical dashed line at $e=500$ marks the activation of the R&F mechanism, initiating the stability phase.
  • Figure 5: Comparative analysis of convergence performance against baselines over 700 episodes. The proposed ORCHID demonstrates a significant performance leap following the R&F activation at $e=500$, while MADDPG exhibits higher variance and lower steady-state efficiency.
  • ...and 3 more figures
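
The Reset-and-Finetune step described in the Figure 2 caption can be sketched as follows. The caption states only that a JFI plateau triggers clearing of the optimizer moments $\mathbf{m}, \mathbf{v}$ and a learning rate decay by factor $\kappa$. The plateau rule (`window`, `tol`), the value of `kappa`, the resetting of the step counter, and the `AdamState` stand-in class are all assumptions made for illustration, not the authors' implementation.

```python
class AdamState:
    """Toy stand-in for an Adam optimizer: moment estimates plus a learning rate."""

    def __init__(self, n_params, lr):
        self.lr = lr
        self.m = [0.0] * n_params  # first-moment estimates
        self.v = [0.0] * n_params  # second-moment estimates
        self.step = 0              # step count used for bias correction


def plateau_detected(jfi_history, window=20, tol=1e-3):
    """Hypothetical rule: mean JFI over the last `window` episodes improved
    by less than `tol` relative to the window before it."""
    if len(jfi_history) < 2 * window:
        return False
    recent = sum(jfi_history[-window:]) / window
    previous = sum(jfi_history[-2 * window:-window]) / window
    return recent - previous < tol


def reset_and_finetune(opt, kappa=0.5):
    """Clear the optimizer moments and decay the learning rate by kappa."""
    opt.m = [0.0] * len(opt.m)
    opt.v = [0.0] * len(opt.v)
    opt.step = 0       # assumption: bias correction restarts as well
    opt.lr *= kappa    # kappa < 1 shrinks the step size for fine-tuning
    return opt


# Usage: JFI climbs for a while, then flattens out.
opt = AdamState(n_params=3, lr=1e-3)
opt.m, opt.v = [0.1, -0.2, 0.3], [0.01, 0.04, 0.09]
jfi_history = [0.5 + 0.01 * i for i in range(20)] + [0.7] * 40
if plateau_detected(jfi_history):
    reset_and_finetune(opt)
```

In a real MAPPO training loop the same idea would apply to the actor and critic optimizers together, which is presumably what the abstract means by a "synchronized" decay.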