GNN-Empowered Effective Partial Observation MARL Method for AoI Management in Multi-UAV Network
Yuhao Pan, Xiucheng Wang, Zhiyao Xu, Nan Cheng, Wenchao Xu, Jun-jie Zhang
TL;DR
This work tackles UAV trajectory optimization under partial observability to minimize AoI in multi-UAV networks. It introduces Qedgix, a framework that couples EdgeConv-based Graph Neural Networks with the QMIX cooperative MARL architecture to enable distributed, CTDE-enabled optimization, leveraging permutation invariance for scalable training. The method models the problem as a Dec-POMDP and demonstrates that integrating GNNs with the QMIX mixer accelerates convergence and reduces mean AoI compared to baselines like QMIX and static-graph GNN variants. Experiments across varying network sizes and detection ranges show robust performance improvements, highlighting the practical potential of GNN-enhanced MARL in dynamic UAV networks for data collection tasks.
Abstract
Unmanned Aerial Vehicles (UAVs), owing to their low cost and high flexibility, are widely used to enhance network performance in a variety of scenarios. However, optimizing UAV trajectories in unknown areas, or in areas without sufficient prior information, still suffers from poor planning performance and limited capability for distributed execution. These challenges arise when each UAV can rely only on its own observations and on information from other UAVs within its communication range, without access to global information. To address them, this paper proposes the Qedgix framework, which combines graph neural networks (GNNs) with the QMIX algorithm to achieve distributed optimization of the Age of Information (AoI) of users in unknown scenarios. The framework uses GNNs to extract information from each UAV, the users within its observable range, and the other UAVs within its communication range, thereby enabling effective UAV trajectory planning. Because AoI is a discrete, time-dependent metric, Qedgix employs QMIX to solve the resulting decentralized partially observable Markov decision process (Dec-POMDP) under centralized training with decentralized execution (CTDE), with the mean AoI of users as the objective. By modeling the UAV network optimization problem in terms of AoI and applying the Kolmogorov-Arnold representation theorem, the Qedgix framework achieves efficient neural network training through parameter sharing based on permutation invariance. Simulation results demonstrate that the proposed algorithm significantly improves convergence speed while reducing the mean AoI of users. The code is available at https://github.com/UNIC-Lab/Qedgix.
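The permutation invariance that the abstract relies on can be illustrated with a minimal EdgeConv-style update in plain Python. This is a sketch under stated assumptions, not the Qedgix implementation: the function name `edge_conv`, the single shared linear layer with ReLU, the max aggregation, and the feature sizes are all hypothetical choices made for illustration. The point is only that a shared per-edge function followed by a symmetric aggregation gives the same output regardless of the order in which a UAV lists its observed users and neighboring UAVs.

```python
import random


def edge_conv(x_i, neighbors, weights, bias):
    """EdgeConv-style update for one node (assumes at least one neighbor).

    x_i:       feature list of the central UAV (hypothetical layout)
    neighbors: list of feature lists for observed users / nearby UAVs
    weights:   h rows, each of length 2 * len(x_i), shared across all edges
    bias:      list of h bias terms
    """
    messages = []
    for x_j in neighbors:
        # Edge feature: central features concatenated with the relative offset.
        edge = list(x_i) + [b - a for a, b in zip(x_i, x_j)]
        # Shared linear layer followed by ReLU, applied to every edge.
        msg = [
            max(0.0, sum(w * e for w, e in zip(row, edge)) + c)
            for row, c in zip(weights, bias)
        ]
        messages.append(msg)
    # Element-wise max over neighbors is symmetric, so order does not matter.
    return [max(column) for column in zip(*messages)]


random.seed(0)
d, h, k = 4, 8, 5  # feature size, hidden size, number of neighbors (assumed)
x_i = [random.gauss(0, 1) for _ in range(d)]
neighbors = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]
weights = [[random.gauss(0, 1) for _ in range(2 * d)] for _ in range(h)]
bias = [random.gauss(0, 1) for _ in range(h)]

out = edge_conv(x_i, neighbors, weights, bias)

# Present the same neighbors in a different order.
shuffled = neighbors[:]
random.shuffle(shuffled)
out_shuffled = edge_conv(x_i, shuffled, weights, bias)
```

Because the same `weights` and `bias` are applied to every edge, the layer has a fixed number of parameters no matter how many users or UAVs are in range, which is the kind of parameter sharing that permutation invariance makes possible.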
