Cooperative Task Offloading through Asynchronous Deep Reinforcement Learning in Mobile Edge Computing for Future Networks
Yuelin Liu, Haiyuan Li, Xenofon Vasilakos, Rasheed Hussain, Dimitra Simeonidou
TL;DR
This work tackles joint latency and energy minimization for cooperative task offloading in Mobile Edge Computing (MEC) under asynchronous task arrivals. It proposes CTO-TP, a framework that combines Transformer-driven prediction with asynchronous multi-agent DRL, formulated as a multi-agent MDP: MADQN handles the discrete server selection, and MADDPG handles the continuous offloading and resource-allocation decisions. A Transformer model predicts future task arrival times and resource demands, and the prediction $Q_{Mt}$ is fed into the global state to improve long-horizon decisions. The objective is the weighted sum $\min \sum_{t=1}^{T} (\lambda L_{total} + \rho E_{total})$ of total latency and total energy, with $\lambda + \rho = 1$. Experiments on a three-MEC network using Google Cluster Traces show CTO-TP reducing overall system latency and energy consumption by up to 80% and 87%, respectively, compared to baselines, supporting the value of edge-edge cooperation and asynchronous training.
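As a rough illustration of the weighted objective $\sum_{t=1}^{T} (\lambda L_{total} + \rho E_{total})$ with $\lambda + \rho = 1$, the sketch below evaluates the cost of a given trajectory. It is not the authors' implementation: the function name `weighted_cost`, its arguments, and the assumption that per-step latency and energy are already scaled to comparable units are all hypothetical.

```python
from typing import Sequence


def weighted_cost(latencies: Sequence[float],
                  energies: Sequence[float],
                  lam: float) -> float:
    """Sum of lam * L_total + rho * E_total over all time steps.

    Hypothetical helper: `latencies[t]` and `energies[t]` stand for
    L_total and E_total at step t, assumed pre-scaled to comparable units.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(latencies) != len(energies):
        raise ValueError("latencies and energies must have equal length")

    rho = 1.0 - lam  # enforces the constraint lam + rho = 1
    return sum(lam * lat + rho * en for lat, en in zip(latencies, energies))


# Two time steps, equal weight on latency and energy.
cost = weighted_cost([2.0, 4.0], [1.0, 3.0], lam=0.5)
```

Setting `lam=1.0` reduces the objective to pure latency minimisation and `lam=0.0` to pure energy minimisation; intermediate values trade one off against the other.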
Abstract
Future networks (including 6G) are poised to accelerate the realisation of the Internet of Everything. However, this will create a high demand for computing resources to support new services. Mobile Edge Computing (MEC) is a promising solution: it enables end-user devices to offload computation-intensive tasks to nearby edge servers, thereby reducing latency and energy consumption. Yet relying solely on a single MEC server for task offloading can lead to uneven resource utilisation and suboptimal performance in complex scenarios. Additionally, traditional task offloading strategies rely on centralised policy decisions, which unavoidably incur extreme transmission latency and run into computational bottlenecks. To fill these gaps, we propose a latency- and energy-efficient Cooperative Task Offloading framework with Transformer-driven Prediction (CTO-TP), built on asynchronous multi-agent deep reinforcement learning. The approach fosters edge-edge cooperation and reduces synchronisation waiting time through asynchronous training, while optimising task offloading and resource allocation across distributed networks. The performance evaluation demonstrates that the proposed CTO-TP algorithm reduces overall system latency and energy consumption by up to 80% and 87%, respectively, compared to the baseline schemes.
