Resilient Topology-Aware Coordination for Dynamic 3D UAV Networks under Node Failure

Chuan-Chi Lai

TL;DR

The paper tackles resilience of dynamic 3D aerial-ground networks under sudden UAV node failures and proposes TAG-MAPPO, a topology-aware graph-based MARL framework with a TA-GAT critic and Random Observation Shuffling to enable autonomous topological reconfiguration. It demonstrates that topology-aware coordination reduces signaling overhead, enables rapid self-healing with over 90% restoration of pre-failure coverage within 15 time steps, and improves fairness in dense urban deployments. The approach achieves faster convergence and higher energy efficiency than MLP-based MAPPO and QMIX baselines across urban, suburban, and rural scenarios, highlighting the value of graph-based relational reasoning for volatile network topologies. The results indicate that incorporating topology intelligence is essential for robust, scalable 6G aerial networks and provides a foundation for adaptive deployments in highly dynamic environments.

Abstract

In 3D Aerial-Ground Integrated Networks (AGINs), ensuring continuous service coverage under unexpected hardware failures is essential for mission-critical applications. While Multi-Agent Reinforcement Learning (MARL) has shown promise in autonomous coordination, its resilience under sudden node failures remains a challenge due to dynamic topology deformation. This paper proposes a Topology-Aware Graph MAPPO (TAG-MAPPO) framework designed to enhance system survivability through autonomous 3D spatial reconfiguration. The framework combines graph-based feature aggregation with a residual ego-state fusion mechanism to capture inter-agent dependencies. This architecture enables the surviving swarm to adapt its topology more rapidly than conventional Multi-Layer Perceptron (MLP)-based approaches. Extensive simulations across heterogeneous environments, ranging from interference-limited Crowded Urban to sparse Rural areas, validate the proposed approach. The results demonstrate that TAG-MAPPO consistently outperforms baselines in both stability and efficiency; specifically, it reduces redundant handoffs by up to 50% while maintaining a lead in energy efficiency. Most notably, the framework exhibits strong self-healing capabilities following a catastrophic node failure. TAG-MAPPO restores over 90% of the pre-failure service coverage within 15 time steps, exhibiting a significantly faster V-shaped recovery trajectory than MLP baselines. Furthermore, in dense urban scenarios, the framework achieves a post-failure Jain's Fairness Index that surpasses that of its original four-UAV configuration by effectively resolving service overlaps. These findings suggest that topology-aware coordination is essential for realizing resilient 6G aerial networks and provides a robust foundation for adaptive deployments in volatile environments.
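For readers unfamiliar with the fairness metric named in the abstract, the following is a minimal sketch of the standard Jain's Fairness Index, J = (Σx)² / (n · Σx²). The abstract does not state which quantity the paper feeds into the index, so the function name `jain_fairness_index`, the per-user rate inputs, and the all-zero convention are assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
from typing import Sequence


def jain_fairness_index(allocations: Sequence[float]) -> float:
    """Standard Jain's index: (sum x)^2 / (n * sum x^2).

    For non-negative inputs the value lies in [1/n, 1]; 1 means all
    allocations are equal, 1/n means a single entity receives everything.
    """
    n = len(allocations)
    if n == 0:
        raise ValueError("need at least one allocation")
    total = sum(allocations)
    sum_of_squares = sum(x * x for x in allocations)
    if sum_of_squares == 0:
        # Assumption: an all-zero allocation is treated as perfectly equal.
        return 1.0
    return (total * total) / (n * sum_of_squares)


# Illustrative (made-up) per-user rates, not results from the paper.
equal_share = jain_fairness_index([1.0, 1.0, 1.0, 1.0])
one_user_only = jain_fairness_index([4.0, 0.0, 0.0, 0.0])
print(equal_share, one_user_only)
```

Because the index depends only on how evenly the allocations are spread, a smaller fleet can score higher than a larger one if it removes overlapping coverage, which is consistent with the post-failure observation reported in the abstract.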

Paper Structure (61 sections, 24 equations, 6 figures, 1 table)

Introduction
Related Work
Multi-UAV Control and Coverage Optimization
Network Resilience and Fault Tolerance
Attention Mechanisms and Reconfigurable MARL
System Model
3D Aerial-Ground Network Architecture
Scenario Description
Physical Configuration and Survivability
Inter-UAV Coordination and Control Plane
Wireless Backhaul (Data Plane)
User Distribution and Mobility Models
Initial Spatial Distribution
Mobility Dynamics
Channel Propagation and Interference Models
...and 46 more sections

Figures (6)

Figure 1: Illustration of the dynamic 3D aerial-ground integrated network scenario. A fleet of UAVs functions as flying base stations to assist the Macro GBS in serving high-mobility vehicular platoons at a traffic intersection. The red solid arrows represent the high-capacity wireless backhaul links connected to the GBS (Data Plane), forming a star topology. The blue dotted lines denote the inter-UAV coordination links for exchanging local state information (Control Plane). The dashed ovals indicate the directional service footprints dynamically tracking the moving user clusters.
Figure 2: The schematic architecture of the proposed TAG-MAPPO framework: (1) Decentralized Execution (Actor): Each UAV agent utilizes a lightweight, shared MLP encoder to map local observations $o_k(t)$ to actions, ensuring real-time responsiveness. (2) Centralized Training (Critic): The critic leverages a Topology-Aware Graph Attention (TA-GAT) mechanism to estimate state values $V_{\phi}(s_t)$. This module employs a decoupled dual-path strategy. A Random Observation Shuffling (ROS) operator is applied to the neighbor feature set $\{\mathbf{h}_j\}_{j \in \mathcal{N}_k(t)}$ to ensure permutation invariance. Simultaneously, a bold red skip connection preserves the ego-state information $\mathbf{h}_k$ to maintain operational consistency. This architecture enables the framework to perform precise relational reasoning and rapid spatial reconfiguration, which ensures robust coordination even under sudden node failures.
Figure 3: Training Procedure for TAG-MAPPO with ROS
Figure 4: Training convergence analysis across heterogeneous scenarios. The top row (urban through rural panels) illustrates the Average Episode Reward, while the bottom row depicts the evolution of the Coverage Ratio ($C_{\text{cov}}$). Results are averaged over 5 independent runs, with shaded areas representing the 95% confidence interval, demonstrating the framework's stability under stochastic mobility.
Figure 5: Stability and energy efficiency analysis. The top row (urban through rural panels) illustrates the Total Handoffs ($C_{\text{HO}}$) across different scenarios, while the bottom row depicts the Energy Efficiency ($E_{\text{eff}}$). Results are averaged over 5 independent runs, with shaded areas representing the 95% confidence interval, demonstrating the framework's superior stability and resource efficiency under stochastic mobility.
...and 1 more figures
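
The Figure 2 caption describes a critic that aggregates a shuffled neighbor feature set $\{\mathbf{h}_j\}$ through graph attention and adds the ego state $\mathbf{h}_k$ back through a skip connection. The sketch below illustrates that idea with a single-head, GAT-style layer in NumPy. It is not the authors' TA-GAT implementation: the weight matrices `W`, `a`, `W_skip`, the feature dimensions, and the function names are hypothetical, and the shuffle here only checks that the aggregation is order-independent and tolerates a missing neighbor.

```python
import numpy as np


def attention_aggregate(h_ego, h_neighbors, W, a):
    """Single-head GAT-style aggregation of neighbor features for one agent."""
    if h_neighbors.shape[0] == 0:
        # No surviving neighbors: nothing to aggregate.
        return np.zeros(W.shape[1])
    z_ego = h_ego @ W                      # (d_out,)
    z_nb = h_neighbors @ W                 # (n, d_out)
    # Score each (ego, neighbor) pair with a shared attention vector.
    pairs = np.concatenate([np.broadcast_to(z_ego, z_nb.shape), z_nb], axis=1)
    scores = pairs @ a                     # (n,)
    scores = np.where(scores > 0, scores, 0.2 * scores)  # LeakyReLU
    scores = scores - scores.max()         # numerically stable softmax
    alpha = np.exp(scores) / np.exp(scores).sum()
    return alpha @ z_nb                    # attention-weighted sum, (d_out,)


def fuse_with_ego(h_ego, aggregated, W_skip):
    """Residual fusion: project the ego state and add it to the aggregate."""
    return h_ego @ W_skip + aggregated


rng = np.random.default_rng(0)
d_in, d_out, n_neighbors = 6, 4, 3         # hypothetical sizes
W = rng.normal(size=(d_in, d_out))
a = rng.normal(size=2 * d_out)
W_skip = rng.normal(size=(d_in, d_out))
h_k = rng.normal(size=d_in)
h_nb = rng.normal(size=(n_neighbors, d_in))

out = fuse_with_ego(h_k, attention_aggregate(h_k, h_nb, W, a), W_skip)

# Shuffle the neighbor order: the fused output should not change.
shuffled = h_nb[rng.permutation(n_neighbors)]
out_shuffled = fuse_with_ego(h_k, attention_aggregate(h_k, shuffled, W, a), W_skip)
assert np.allclose(out, out_shuffled)

# Simulate a node failure by dropping one neighbor: the output shape is unchanged.
out_failed = fuse_with_ego(h_k, attention_aggregate(h_k, h_nb[:-1], W, a), W_skip)
assert out_failed.shape == (d_out,)
```

Because the neighbor set is reduced by a softmax-weighted sum, the layer accepts any number of neighbors in any order, while the skip path keeps the agent's own state from being diluted by the aggregate.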
