Distributed Computation Offloading for Energy Provision Minimization in WP-MEC Networks with Multiple HAPs
Xiaoying Liu, Anping Chen, Kechen Zheng, Kaikai Chi, Bin Yang, Tarik Taleb
TL;DR
The paper tackles energy provision minimization in a dynamic wireless powered mobile edge computing (WP-MEC) network with multiple hybrid access points (HAPs) operating in a harvest-then-offload manner. It introduces TMADO, a two-stage distributed deep reinforcement learning framework: a high-level policy based on deep deterministic policy gradient (DDPG) at the HAPs sets the wireless power transfer (WPT) transmit power and duration, while low-level policies based on independent proximal policy optimization (IPPO) at the wireless devices (WDs) set the offloading decisions and local CPU settings. The method decomposes the NP-hard long-term optimization problem into the WCDO, ODO, and RO subproblems, enabling scalable, distributed optimization under centralized training with decentralized execution (CTDE). Simulation results show that TMADO achieves lower HAP energy provision and more favorable offloading behavior than several baselines, highlighting the framework's potential for green WP-MEC deployment in networks with multiple HAPs.
Abstract
This paper investigates a wireless powered mobile edge computing (WP-MEC) network with multiple hybrid access points (HAPs) in a dynamic environment, where wireless devices (WDs) harvest energy from the radio frequency (RF) signals of HAPs and then either process their computation data locally (i.e., local computing mode) or offload it to the chosen HAPs (i.e., edge computing mode). To pursue a green computing design, we formulate an optimization problem that minimizes the long-term energy provision of the WP-MEC network subject to energy, computing delay, and computation data demand constraints. The transmit power of HAPs, the duration of the wireless power transfer (WPT) phase, the offloading decisions of WDs, the time allocation for offloading, and the CPU frequency for local computing are jointly optimized to adapt to the time-varying generated computation data and wireless channels of WDs. To efficiently address the formulated non-convex mixed integer programming (MIP) problem in a distributed manner, we propose a Two-stage Multi-Agent deep reinforcement learning-based Distributed computation Offloading (TMADO) framework, which consists of a high-level agent and multiple low-level agents. The high-level agent residing in all HAPs optimizes the transmit power of HAPs and the duration of the WPT phase, while each low-level agent residing in each WD optimizes its offloading decision, time allocation for offloading, and CPU frequency for local computing. Simulation results show the superiority of the proposed TMADO framework in minimizing energy provision.
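To make the harvest-then-offload operation and its constraints concrete, the following is a minimal sketch of the energy and delay bookkeeping for a single WD in one time slot. It is not the paper's system model. The linear energy-harvesting model, the kappa * cycles * f^2 local-computing energy model, the assumption that computing or offloading starts only after the WPT phase ends, and all names and default values (`SlotPlan`, `eh_efficiency`, `cycles_per_bit`, `kappa`, `slot_s`) are illustrative assumptions. The sketch also omits the data-rate check that a full model would need to verify that the offloading time is long enough to deliver the data.

```python
from dataclasses import dataclass


@dataclass
class SlotPlan:
    """Decisions for one WD in one time slot (all field names are hypothetical)."""
    hap_tx_power_w: float      # HAP transmit power during the WPT phase
    wpt_duration_s: float      # duration of the WPT phase
    offload: bool              # True: edge computing mode, False: local computing mode
    offload_duration_s: float  # time allocated to offloading (used only if offload)
    wd_tx_power_w: float       # WD transmit power while offloading (assumed fixed)
    cpu_freq_hz: float         # CPU frequency for local computing (used only if not offload)


def harvested_energy(plan: SlotPlan, channel_gain: float,
                     eh_efficiency: float = 0.5) -> float:
    """Energy harvested by the WD, assuming a linear harvesting model."""
    return eh_efficiency * plan.hap_tx_power_w * channel_gain * plan.wpt_duration_s


def consumed_energy(plan: SlotPlan, data_bits: float,
                    cycles_per_bit: float = 100.0, kappa: float = 1e-26) -> float:
    """Energy the WD spends on offloading or on local computing."""
    if plan.offload:
        return plan.wd_tx_power_w * plan.offload_duration_s
    cycles = data_bits * cycles_per_bit
    # Assumed dynamic-power model: energy = kappa * cycles * f^2.
    return kappa * cycles * plan.cpu_freq_hz ** 2


def compute_delay(plan: SlotPlan, data_bits: float,
                  cycles_per_bit: float = 100.0) -> float:
    """Time spent after the WPT phase on offloading or local computing."""
    if plan.offload:
        return plan.offload_duration_s
    return data_bits * cycles_per_bit / plan.cpu_freq_hz


def is_feasible(plan: SlotPlan, channel_gain: float, data_bits: float,
                slot_s: float = 1.0) -> bool:
    """Check the energy constraint and the delay constraint for this slot."""
    energy_ok = consumed_energy(plan, data_bits) <= harvested_energy(plan, channel_gain)
    delay_ok = plan.wpt_duration_s + compute_delay(plan, data_bits) <= slot_s
    return energy_ok and delay_ok


def hap_energy_provision(plan: SlotPlan) -> float:
    """Energy the HAP provides in this slot: the quantity to be minimized."""
    return plan.hap_tx_power_w * plan.wpt_duration_s
```

Under these assumptions, lowering `hap_tx_power_w` or `wpt_duration_s` reduces `hap_energy_provision` but also shrinks the harvested energy, so `is_feasible` eventually fails. This is the coupling between the HAP-side and WD-side decisions that the joint optimization has to balance.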
