Egret: Reinforcement Mechanism for Sequential Computation Offloading in Edge Computing
Haosong Peng, Yufeng Zhan, DiHua Zhai, Xiaopu Zhang, Yuanqing Xia
TL;DR
The paper tackles revenue maximization for edge computing service providers when client preferences are private and clients arrive dynamically. It formulates a sequential price mechanism (SPM) and introduces Egret, a DRL-based method that learns the visiting order and prices online. Egret uses a PPO-based actor-critic architecture with a price ranking technique and a state-space design that lets it operate without exposing client preferences. The work provides a theoretical Oracle solution for the static case and shows experimentally that Egret comes within 1.29% of Oracle revenue in the static setting (SCOM) and outperforms state-of-the-art baselines by 23.43% in the dynamic-arrival setting (DSCOM), while remaining robust across varying traces and network conditions. The practical impact is efficient, privacy-preserving online pricing and task offloading that can adapt to real-world MEC deployments.
Abstract
As an emerging computing paradigm, edge computing offers computing resources closer to the data sources, helping to improve the service quality of many real-time applications. A crucial problem is designing a rational pricing mechanism to maximize the revenue of the edge computing service provider (ECSP). However, prior works have considerable limitations: they assume that the number of clients is static and that clients disclose their private preferences, which is impractical in reality. To address this issue, we propose a novel sequential computation offloading mechanism, in which the ECSP posts prices for computing resources of different configurations to clients in turn. Each client independently chooses which computing resources to purchase and how to offload, based on the posted prices. We then propose Egret, a deep reinforcement learning-based approach that aims to maximize the ECSP's revenue. Egret determines prices and the visiting order online without knowledge of clients' preferences. Experimental results show that the ECSP's revenue under Egret is only 1.29% lower than Oracle and 23.43% higher than the state of the art when clients arrive dynamically.
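The sequential posting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Client` class, the `run_sequential_mechanism` function, the unit-capacity resource model, and the fixed price table are all hypothetical, and the visiting order and prices are hard-coded here, whereas Egret learns them with a DRL agent. The sketch only shows that clients decide using valuations the provider never reads, and that the visiting order can change revenue.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Client:
    name: str
    # Private valuations per resource configuration (hypothetical model).
    # The provider-side code below never reads this field.
    valuations: Dict[str, float]

    def choose(self, prices: Dict[str, float]) -> Optional[str]:
        """Return the offered configuration with the highest non-negative utility."""
        best, best_utility = None, -1.0
        for config, price in prices.items():
            utility = self.valuations.get(config, 0.0) - price
            if utility >= 0 and utility > best_utility:
                best, best_utility = config, utility
        return best


def run_sequential_mechanism(
    clients: List[Client],
    order: List[int],
    prices_for: Callable[[int, Dict[str, int]], Dict[str, float]],
    capacity: Dict[str, int],
) -> Tuple[float, Dict[str, str]]:
    """Visit clients in `order`, post prices, and collect revenue from purchases."""
    remaining = dict(capacity)
    revenue = 0.0
    allocation: Dict[str, str] = {}
    for idx in order:
        # Only configurations that are still available are offered.
        offer = {
            config: price
            for config, price in prices_for(idx, remaining).items()
            if remaining.get(config, 0) > 0
        }
        pick = clients[idx].choose(offer)
        if pick is not None:
            revenue += offer[pick]
            remaining[pick] -= 1
            allocation[clients[idx].name] = pick
    return revenue, allocation


# Placeholder pricing policy: the same fixed prices for every client.
def fixed_prices(idx: int, remaining: Dict[str, int]) -> Dict[str, float]:
    return {"small": 2.5, "large": 5.0}


clients = [
    Client("A", {"large": 8.0}),
    Client("B", {"small": 4.0, "large": 9.0}),
]
capacity = {"small": 1, "large": 1}

print(run_sequential_mechanism(clients, [0, 1], fixed_prices, capacity))
# (7.5, {'A': 'large', 'B': 'small'})
print(run_sequential_mechanism(clients, [1, 0], fixed_prices, capacity))
# (5.0, {'B': 'large'})
```

With identical prices, visiting A first yields revenue 7.5, while visiting B first yields 5.0 because B takes the only "large" unit and A does not value "small". This is the kind of order and price dependence that a learned policy would have to account for.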
