Beyond the Edge: An Advanced Exploration of Reinforcement Learning for Mobile Edge Computing, its Applications, and Future Research Trajectories
Ning Yang, Shuo Chen, Haijun Zhang, Randall Berry
TL;DR
This survey maps reinforcement learning (RL) methods onto mobile edge computing (MEC) to address end-to-end optimization of offloading, caching, and communication. It provides a structured taxonomy of RL formulations (MDP, POMDP, MAB, MARL), contrasts single-agent RL (SARL) with multi-agent RL (MARL), including centralized training with decentralized execution (CTDE), and then details RL solutions for offloading, caching, and wireless resources. The paper highlights key applications in Industry 4.0, autonomous driving, robotics, VR/AR, and healthcare, and discusses critical challenges such as latency, data rate, mobility, security, and privacy. It also proposes future directions in software/hardware platforms, representation, robustness, safe RL, large-scale scheduling, generalization, and sim-to-real transfer to accelerate deployment of RL-enabled MEC systems.
Abstract
Mobile Edge Computing (MEC) extends computation and storage beyond the central network to edge nodes close to end devices, which enables large-scale deployments of "connected things" within edge networks. Emerging applications that require real-time, high-quality service impose demanding requirements: low latency, high data rate, reliability, efficiency, and security. Incorporating reinforcement learning (RL) into MEC networks helps capture mobile user behavior and network dynamics, and thereby optimizes the use of computing and communication resources. This paper presents a comprehensive survey of RL applications in MEC networks. It first gives an overview of RL, from fundamental principles to recent advanced frameworks. It then outlines the RL strategies employed for offloading, caching, and communication in MEC networks. Finally, it explores open issues related to software and hardware platforms, representation, RL robustness, safe RL, large-scale scheduling, generalization, security, and privacy, proposes specific RL techniques to mitigate these issues, and provides insights into their practical application.
