Multi-Agent DRL for Queue-Aware Task Offloading in Hierarchical MEC-Enabled Air-Ground Networks
Muhammet Hevesli, Abegaz Mohammed Seid, Aiman Erbad, Mohamed Abdallah
TL;DR
Addresses energy minimization in multi-tier MEC-enabled air-ground integrated networks (MAGIN) by jointly optimizing offloading decisions, UAV trajectories, and edge computing resources, formulated as a heterogeneous multi-agent MDP. Proposes MAPPO-BD, a MAPPO-based framework whose actor uses a Beta distribution to handle bounded actions, trained under centralized training with decentralized execution (CTDE) with IoTD, UAV, and HAPS agents. In extensive simulations, it reports greater energy savings and better queue management than MADDPG, MAPPO-ND, and PO-MAPPO-BD, along with scalability to multi-UAV/HAPS topologies and robustness to dynamic task profiles. The approach targets real-time, queue-aware MEC in 6G air-ground networks, where resources must be coordinated across a hierarchical edge computing environment.
Abstract
Mobile edge computing (MEC)-enabled air-ground networks are a key component of 6G, employing aerial base stations (ABSs) such as unmanned aerial vehicles (UAVs) and high-altitude platform stations (HAPS) to provide dynamic services to ground IoT devices (IoTDs). These IoTDs support real-time applications (e.g., multimedia and Metaverse services) that demand high computational resources and strict quality of service (QoS) guarantees in terms of latency and task queue management. Given their limited energy and processing capabilities, IoTDs rely on UAVs and HAPS to offload tasks for distributed processing, forming a multi-tier MEC system. This paper tackles the overall energy minimization problem in MEC-enabled air-ground integrated networks (MAGIN) by jointly optimizing UAV trajectories, computing resource allocation, and queue-aware task offloading decisions. The optimization is challenging due to the nonconvex, nonlinear nature of this hierarchical system, which renders traditional methods ineffective. We reformulate the problem as a multi-agent Markov decision process (MDP) with continuous action spaces and heterogeneous agents, and propose a novel variant of multi-agent proximal policy optimization with a Beta distribution (MAPPO-BD) to solve it. Extensive simulations show that MAPPO-BD outperforms baseline schemes, achieving superior energy savings and efficient resource management in MAGIN while meeting queue delay and edge computing constraints.
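To illustrate why a Beta-distributed actor suits bounded continuous actions, the following is a minimal sketch and not the paper's implementation. A Beta distribution has support on [0, 1], so a sample can be rescaled affinely to any action interval without clipping, whereas a Gaussian actor has unbounded support. The function names, the softplus-plus-one parameterization, and the example bounds are all assumptions made for illustration; the raw inputs stand in for the two outputs of an actor network for one action dimension.

```python
import math
import random


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def beta_params(raw_alpha: float, raw_beta: float) -> tuple[float, float]:
    """Map unconstrained actor outputs to Beta shape parameters.

    Assumption: adding 1 keeps both parameters above 1, which makes the
    density unimodal. The paper may use a different parameterization.
    """
    return 1.0 + softplus(raw_alpha), 1.0 + softplus(raw_beta)


def sample_bounded_action(raw_alpha: float, raw_beta: float,
                          low: float, high: float,
                          rng: random.Random) -> float:
    """Draw u ~ Beta(alpha, beta) on [0, 1] and rescale it to [low, high]."""
    alpha, beta = beta_params(raw_alpha, raw_beta)
    u = rng.betavariate(alpha, beta)
    return low + (high - low) * u


def beta_log_prob(u: float, alpha: float, beta: float) -> float:
    """Log-density of Beta(alpha, beta) at u in (0, 1).

    A PPO-style update would use this to form the probability ratio
    between the new and old policies.
    """
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (alpha - 1.0) * math.log(u) + (beta - 1.0) * math.log(1.0 - u) - log_norm


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical example: one bounded action dimension in [-5, 5].
    actions = [sample_bounded_action(0.3, -1.2, -5.0, 5.0, rng) for _ in range(5)]
    print(actions)
```

Every sampled action lies inside the stated bounds by construction, so no probability mass is lost to clipping at the edges of the action range. When the log-probability is taken with respect to the rescaled action rather than u, a constant term of -log(high - low) is added; it cancels in the PPO probability ratio.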
