Resilient Topology-Aware Coordination for Dynamic 3D UAV Networks under Node Failure

Chuan-Chi Lai

TL;DR

The paper tackles the resilience of dynamic 3D aerial-ground networks under sudden UAV node failures. It proposes TAG-MAPPO, a topology-aware, graph-based MARL framework that pairs a Topology-Aware Graph Attention (TA-GAT) critic with Random Observation Shuffling (ROS) to enable autonomous topological reconfiguration. The paper reports that topology-aware coordination reduces signaling overhead, restores over 90% of pre-failure coverage within 15 time steps after a node failure, and improves fairness in dense urban deployments. TAG-MAPPO also converges faster and achieves higher energy efficiency than MLP-based MAPPO and QMIX baselines across urban, suburban, and rural scenarios, which highlights the value of graph-based relational reasoning for volatile network topologies. The authors conclude that topology awareness is essential for robust, scalable 6G aerial networks and that it provides a foundation for adaptive deployments in highly dynamic environments.
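The self-healing figure above (over 90% of pre-failure coverage within 15 time steps) can be made concrete with a small sketch. This is an illustration only, not the paper's evaluation code: the function name `recovery_steps`, the five-step baseline window, and the sample coverage trace are all assumptions.

```python
def recovery_steps(coverage, failure_step, threshold=0.9, baseline_window=5):
    """Return how many steps after `failure_step` the coverage first reaches
    `threshold` times the pre-failure baseline, or None if it never does.

    `failure_step` is the first index at which the failure is in effect.
    The baseline is the mean coverage over the `baseline_window` steps
    before the failure (an assumed definition, for illustration).
    """
    pre_failure = coverage[max(0, failure_step - baseline_window):failure_step]
    if not pre_failure:
        raise ValueError("need at least one pre-failure sample")
    baseline = sum(pre_failure) / len(pre_failure)
    target = threshold * baseline
    for offset, value in enumerate(coverage[failure_step:]):
        if value >= target:
            return offset
    return None


# Hypothetical coverage-ratio trace: stable, sharp drop at step 5, then recovery.
trace = [0.80, 0.80, 0.80, 0.80, 0.80, 0.50, 0.55, 0.62, 0.70, 0.73, 0.78]
steps = recovery_steps(trace, failure_step=5)
print(steps)  # 4
```

Under this definition, a run would match the reported behaviour if `recovery_steps` returned a value of at most 15.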

Abstract

In 3D Aerial-Ground Integrated Networks (AGINs), ensuring continuous service coverage under unexpected hardware failures is essential for mission-critical applications. While Multi-Agent Reinforcement Learning (MARL) has shown promise in autonomous coordination, its resilience under sudden node failures remains a challenge due to dynamic topology deformation. This paper proposes a Topology-Aware Graph MAPPO (TAG-MAPPO) framework designed to enhance system survivability through autonomous 3D spatial reconfiguration. Our framework incorporates graph-based feature aggregation with a residual ego-state fusion mechanism to capture intricate inter-agent dependencies. This architecture enables the surviving swarm to adapt its topology more rapidly than conventional Multi-Layer Perceptron (MLP) based approaches. Extensive simulations across heterogeneous environments, ranging from interference-limited Crowded Urban to sparse Rural areas, validate the proposed approach. The results demonstrate that TAG-MAPPO consistently outperforms baselines in both stability and efficiency; specifically, it reduces redundant handoffs by up to 50 percent while maintaining a lead in energy efficiency. Most notably, the framework exhibits exceptional self-healing capabilities following a catastrophic node failure. TAG-MAPPO restores over 90 percent of the pre-failure service coverage within 15 time steps, exhibiting a significantly faster V-shaped recovery trajectory than MLP baselines. Furthermore, in dense urban scenarios, the framework achieves a post-failure Jain's Fairness Index that even surpasses that of its original four-UAV configuration by effectively resolving service overlaps. These findings suggest that topology-aware coordination is essential for the realization of resilient 6G aerial networks and provides a robust foundation for adaptive deployments in volatile environments.
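For readers unfamiliar with the fairness metric named in the abstract, Jain's Fairness Index over allocations $x_1, \dots, x_n$ is $J = (\sum_i x_i)^2 / (n \sum_i x_i^2)$, which ranges from $1/n$ (one recipient gets everything) to $1$ (perfectly even). The sketch below computes it. Whether the paper evaluates it over per-user rates or per-UAV loads is not stated in this summary, so the sample values are hypothetical.

```python
def jain_fairness(allocations):
    """Jain's Fairness Index: (sum x)^2 / (n * sum x^2), in [1/n, 1]."""
    n = len(allocations)
    if n == 0:
        raise ValueError("need at least one allocation")
    total = sum(allocations)
    sum_of_squares = sum(x * x for x in allocations)
    if sum_of_squares == 0:
        # Convention chosen here: an all-zero allocation counts as even.
        return 1.0
    return (total * total) / (n * sum_of_squares)


# Hypothetical allocations (arbitrary units), not values from the paper.
even = [2.0, 2.0, 2.0, 2.0]
skewed = [4.0, 0.0, 0.0, 0.0]
print(jain_fairness(even))    # 1.0
print(jain_fairness(skewed))  # 0.25
```

Because the index depends on how evenly service is spread rather than on the total, a smaller fleet that removes overlapping footprints can score higher than a larger one, which is consistent with the post-failure result reported above.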

Paper Structure (61 sections, 24 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Illustration of the dynamic 3D aerial-ground integrated network scenario. A fleet of UAVs functions as flying base stations to assist the Macro GBS in serving high-mobility vehicular platoons at a traffic intersection. The red solid arrows represent the high-capacity wireless backhaul links connected to the GBS (Data Plane), forming a star topology. The blue dotted lines denote the inter-UAV coordination links for exchanging local state information (Control Plane). The dashed ovals indicate the directional service footprints dynamically tracking the moving user clusters.
  • Figure 2: The schematic architecture of the proposed TAG-MAPPO framework: (1) Decentralized Execution (Actor): Each UAV agent utilizes a lightweight, shared MLP encoder to map local observations $o_k(t)$ to actions, ensuring real-time responsiveness. (2) Centralized Training (Critic): The critic leverages a Topology-Aware Graph Attention (TA-GAT) mechanism to estimate state values $V_{\phi}(s_t)$. This module employs a decoupled dual-path strategy. A Random Observation Shuffling (ROS) operator is applied to the neighbor feature set $\{\mathbf{h}_j\}_{j \in \mathcal{N}_k(t)}$ to ensure permutation invariance. Simultaneously, a bold red skip connection preserves the ego-state information $\mathbf{h}_k$ to maintain operational consistency. This architecture enables the framework to perform precise relational reasoning and rapid spatial reconfiguration, which ensures robust coordination even under sudden node failures.
  • Figure 3: Training Procedure for TAG-MAPPO with ROS
  • Figure 4: Training convergence analysis across heterogeneous scenarios. The top row (urban through rural panels) illustrates the Average Episode Reward, while the bottom row (urban through rural panels) depicts the evolution of the Coverage Ratio ($C_{\text{cov}}$). Results are averaged over 5 independent runs, with shaded areas representing the 95% confidence interval, demonstrating the framework's stability under stochastic mobility.
  • Figure 5: Stability and Energy Efficiency Analysis: The top row (urban through rural panels) illustrates the Total Handoffs ($C_{\text{HO}}$) across different scenarios, while the bottom row (urban through rural panels) depicts the Energy Efficiency ($E_{\text{eff}}$). Results are averaged over 5 independent runs, with shaded areas representing the 95% confidence interval, demonstrating the framework's superior stability and resource efficiency under stochastic mobility.
  • ...and 1 more figures
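
The critic path described in the Figure 2 caption (shuffle the neighbor feature set, aggregate it with attention, then add the ego feature back through a skip connection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' TA-GAT: the scoring function is not given in this summary, so scaled dot-product attention is swapped in as a stand-in, no learned weights are used, and the names `attention_aggregate` and `fuse` are hypothetical.

```python
import math
import random


def attention_aggregate(ego, neighbors):
    """Softmax-weighted sum of neighbor features, scored against the ego feature.

    Stand-in attention: scaled dot product between ego and each neighbor.
    """
    if not neighbors:
        return [0.0] * len(ego)
    scale = math.sqrt(len(ego))
    scores = [sum(e * n for e, n in zip(ego, nb)) / scale for nb in neighbors]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    norm = sum(weights)
    return [
        sum(w * nb[d] for w, nb in zip(weights, neighbors)) / norm
        for d in range(len(ego))
    ]


def fuse(ego, neighbors, rng):
    """Shuffle neighbors (ROS stand-in), aggregate, then add the ego skip path."""
    shuffled = list(neighbors)
    rng.shuffle(shuffled)
    context = attention_aggregate(ego, shuffled)
    return [e + c for e, c in zip(ego, context)]


# Hypothetical 3-dimensional features for one UAV and its three neighbors.
h_k = [1.0, 0.0, 0.5]
h_neighbors = [[0.2, 0.1, 0.0], [0.9, 0.3, 0.4], [0.0, 1.0, 0.2]]

out_a = fuse(h_k, h_neighbors, random.Random(0))
out_b = fuse(h_k, h_neighbors, random.Random(1))
# Different neighbor orderings give the same result up to rounding.
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(out_a, out_b)))

# After a node failure the neighbor set shrinks; the same code still applies.
out_failed = fuse(h_k, h_neighbors[:2], random.Random(0))
```

Two properties of this sketch mirror the caption: the aggregation does not depend on neighbor order, and the ego feature survives unchanged in the output even when the neighbor set is empty, so the representation remains well defined as the fleet size changes.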