GNN-Empowered Effective Partial Observation MARL Method for AoI Management in Multi-UAV Network

Yuhao Pan, Xiucheng Wang, Zhiyao Xu, Nan Cheng, Wenchao Xu, Jun-jie Zhang

TL;DR

This work tackles UAV trajectory optimization under partial observability to minimize AoI in multi-UAV networks. It introduces Qedgix, a framework that couples EdgeConv-based Graph Neural Networks with the QMIX cooperative MARL architecture to enable distributed, CTDE-enabled optimization, leveraging permutation invariance for scalable training. The method models the problem as a Dec-POMDP and demonstrates that integrating GNNs with the QMIX mixer accelerates convergence and reduces mean AoI compared to baselines like QMIX and static-graph GNN variants. Experiments across varying network sizes and detection ranges show robust performance improvements, highlighting the practical potential of GNN-enhanced MARL in dynamic UAV networks for data collection tasks.
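The objective named above, mean AoI, can be illustrated with a minimal sketch. It assumes the common discrete-time AoI model, which this summary does not spell out: each user's AoI grows by one per time slot and restarts at a fixed value when a UAV collects that user's data. The function names, the reset value, and the service schedule are hypothetical.

```python
from typing import List, Set


def step_aoi(aoi: List[int], served: Set[int], reset_value: int = 1) -> List[int]:
    """Advance every user's AoI by one time slot.

    Assumption: a user whose data is collected in this slot restarts at
    `reset_value`; every other user's AoI grows by one.
    """
    return [reset_value if u in served else a + 1 for u, a in enumerate(aoi)]


def mean_aoi(aoi: List[int]) -> float:
    """Average AoI over all users (the quantity the UAVs try to keep low)."""
    return sum(aoi) / len(aoi)


# Three users, all starting with AoI 1.
aoi = [1, 1, 1]
# Hypothetical schedule: the set of users served in each slot.
schedule = [set(), {0}, {1, 2}]

history = []
for served in schedule:
    aoi = step_aoi(aoi, served)
    history.append(mean_aoi(aoi))
```

Under these assumptions, a natural per-step team reward would be the negative mean AoI, so that fewer stale users means a higher return.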

Abstract

Unmanned Aerial Vehicles (UAVs), owing to their low cost and high flexibility, are widely used to enhance network performance. However, optimizing UAV trajectories in unknown areas, or in areas without sufficient prior information, still suffers from poor planning performance and limited support for distributed execution. These challenges arise when UAVs can rely only on their own observations and on information from other UAVs within communication range, without access to global information. To address them, this paper proposes the Qedgix framework, which combines graph neural networks (GNNs) with the QMIX algorithm to optimize the Age of Information (AoI) of users in unknown scenarios in a distributed manner. The framework uses GNNs to extract information from the UAVs themselves, from users within observation range, and from other UAVs within communication range, thereby enabling effective UAV trajectory planning. Because AoI is a discrete, time-dependent metric, the Qedgix framework employs QMIX to solve the resulting decentralized partially observable Markov decision process (Dec-POMDP) under centralized training and distributed execution (CTDE), with the mean AoI of users as the objective. By modeling the UAV network optimization problem in terms of AoI and applying the Kolmogorov-Arnold representation theorem, the Qedgix framework achieves efficient neural network training through parameter sharing based on permutation invariance. Simulation results demonstrate that the proposed algorithm significantly improves convergence speed while reducing the mean AoI of users. The code is available at https://github.com/UNIC-Lab/Qedgix.
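To make the role of QMIX under CTDE more concrete, the following is a minimal pure-Python sketch of a QMIX-style monotonic mixer. It is not the Qedgix implementation: the class name, layer sizes, random initialization, and the use of single linear layers as hypernetworks are simplifying assumptions. What it does show is the standard QMIX idea that the mixing weights are generated from the global state and forced to be non-negative, so the joint value never decreases when any single agent's value increases.

```python
import math
import random
from typing import List


def elu(x: float) -> float:
    """ELU activation; monotonically increasing, which preserves monotonic mixing."""
    return x if x > 0 else math.exp(x) - 1.0


def matvec(m: List[List[float]], v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class MonotonicMixer:
    """Illustrative QMIX-style mixer (hypothetical sizes and initialization)."""

    def __init__(self, n_agents: int, state_dim: int, embed_dim: int = 4, seed: int = 0):
        rng = random.Random(seed)

        def mat(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Hypernetworks map the global state to mixing weights and biases.
        # Simplification: each one is a single linear layer.
        self.hyper_w1 = mat(n_agents * embed_dim, state_dim)
        self.hyper_b1 = mat(embed_dim, state_dim)
        self.hyper_w2 = mat(embed_dim, state_dim)
        self.hyper_b2 = mat(1, state_dim)

    def __call__(self, agent_qs: List[float], state: List[float]) -> float:
        # abs() keeps the mixing weights non-negative; biases are unconstrained.
        w1 = [abs(v) for v in matvec(self.hyper_w1, state)]
        b1 = matvec(self.hyper_b1, state)
        w2 = [abs(v) for v in matvec(self.hyper_w2, state)]
        b2 = matvec(self.hyper_b2, state)[0]
        hidden = []
        for j in range(self.embed_dim):
            pre = sum(agent_qs[i] * w1[i * self.embed_dim + j]
                      for i in range(self.n_agents)) + b1[j]
            hidden.append(elu(pre))
        return sum(h * w for h, w in zip(hidden, w2)) + b2


mixer = MonotonicMixer(n_agents=3, state_dim=5)
state = [0.2, -0.4, 0.1, 0.9, -0.3]       # hypothetical global state
q_low = mixer([1.0, 0.5, -0.2], state)
q_high = mixer([1.0, 1.5, -0.2], state)   # only the second agent's value is raised
```

Because of this monotonicity, each agent can pick its action greedily from its own value estimate at execution time, and the mixer is needed only while training.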

Paper Structure

This paper contains 17 sections, 1 theorem, 19 equations, 7 figures, 1 table, and 1 algorithm.

Key Result

Lemma 1

The Kolmogorov-Arnold representation theorem states that any continuous multivariate function on a bounded domain can be represented as a finite composition of continuous univariate functions and summations.
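For reference, the classical statement of the theorem can be written as follows. The symbols $\Phi_q$ and $\phi_{q,p}$ are the standard ones from the general theorem, not notation taken from the paper. The second line is the sum-decomposition commonly used for permutation-invariant functions; how exactly the paper specializes the theorem is not shown in this summary, so it should be read as an illustrative assumption.

```latex
% Kolmogorov-Arnold representation: for every continuous f : [0,1]^n -> R
% there exist continuous univariate functions \Phi_q and \phi_{q,p} with
\[
  f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right).
\]

% Permutation-invariant special case (assumed form): the same inner function
% \phi is shared by all inputs, so reordering x_1, ..., x_n leaves f unchanged.
\[
  f(x_1, \dots, x_n) \;=\; \rho\!\left( \sum_{i=1}^{n} \phi(x_i) \right).
\]
```

Sharing one inner function across all inputs is what makes parameter sharing across UAVs and users natural: the number of learnable functions no longer depends on how many nodes are present.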

Figures (7)

  • Figure 1: The UAVs collect data from users in remote areas.
  • Figure 2: Illustration of the graph models. As the UAVs move, their observation and communication ranges cover different users and UAVs, so the graph model of the wireless communication network formed by the UAVs and the users changes as well.
  • Figure 3: Overview of the Qedgix algorithm framework. A GNN extracts features from UAVs and users, and the outputs of the corresponding UAV nodes evaluate how each candidate UAV flight direction affects the average AoI. The Mixer Network is used during training but discarded during inference.
  • Figure 4: Reward comparison for UAVs collecting user data with four different algorithms.
  • Figure 5: Subfigures (a)-(b) show trajectories for the scenario with three UAVs and six users. Subfigures (c)-(d) show trajectories for the scenario with three UAVs and eight users.
  • ...and 2 more figures
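
The dynamic graph in Figure 2 and the GNN feature extraction in Figure 3 can be sketched as below. This is an illustration under stated assumptions rather than the paper's code: positions are 2-D, `obs_range` and `comm_range` are hypothetical parameter names, and the EdgeConv-style update uses a fixed hand-written weight matrix with a ReLU and max aggregation in place of a trained network.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def build_graph(uavs: List[Point], users: List[Point],
                obs_range: float, comm_range: float) -> Dict[int, Dict[str, List[int]]]:
    """For each UAV, list the users it observes and the UAVs it can talk to."""
    graph = {}
    for i, p in enumerate(uavs):
        seen_users = [k for k, u in enumerate(users) if math.dist(p, u) <= obs_range]
        peers = [j for j, q in enumerate(uavs)
                 if j != i and math.dist(p, q) <= comm_range]
        graph[i] = {"users": seen_users, "uavs": peers}
    return graph


def edge_conv(center: Sequence[float], neighbours: List[Sequence[float]],
              weight: List[List[float]]) -> List[float]:
    """EdgeConv-style update: max over neighbours j of ReLU(W [x_i, x_j - x_i]).

    Starting from zeros is safe because ReLU outputs are non-negative; it also
    gives isolated nodes a well-defined (all-zero) output.
    """
    out = [0.0] * len(weight)
    for nb in neighbours:
        edge = list(center) + [b - a for a, b in zip(center, nb)]
        msg = [max(0.0, sum(w * e for w, e in zip(row, edge))) for row in weight]
        out = [max(o, m) for o, m in zip(out, msg)]
    return out


uavs = [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
users = [(1.0, 0.0), (0.0, 2.0), (9.0, 9.0), (20.0, 20.0)]
before = build_graph(uavs, users, obs_range=2.5, comm_range=4.0)

# Moving UAV 1 changes both its observed users and its reachable UAVs.
uavs[1] = (8.0, 8.0)
after = build_graph(uavs, users, obs_range=2.5, comm_range=4.0)

# Hypothetical 2x4 weight matrix acting on [x_i, x_j - x_i] for 2-D features.
W = [[0.5, -0.2, 1.0, 0.3], [-0.1, 0.4, -0.6, 0.8]]
nbrs = [users[0], users[1], uavs[1]]
h0 = edge_conv(uavs[0], nbrs, W)
h0_shuffled = edge_conv(uavs[0], list(reversed(nbrs)), W)
```

The max aggregation does not depend on the order in which neighbours are listed, so `h0` and `h0_shuffled` coincide. This is the kind of permutation invariance that lets one set of weights be shared across all nodes as the graph changes.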

Theorems & Definitions (1)

  • Lemma 1