Deep progressive reinforcement learning-based flexible resource scheduling framework for IRS and UAV-assisted MEC system
Li Dong, Feibo Jiang, Minjie Wang, Yubo Peng, Xiaolong Li
TL;DR
This work tackles energy minimization in IRS- and UAV-assisted MEC under a dynamic number of UAVs by jointly optimizing UAV locations, IRS phase shifts, offloading decisions, and resource allocation. It introduces the Flexible REsource Scheduling (FRES) framework, built on a deep progressive reinforcement learning backbone with a novel multi-task agent, a progressive scheduler to handle changing UAV counts, and a light taboo search (LTS) to strengthen exploration. Key contributions include a two-head multi-task agent for the MINLP problem, a progressive neural architecture that adapts online to topology changes, and an LTS-driven action refinement, all validated through comprehensive experiments showing real-time, adaptive scheduling and energy savings. The approach enables robust, scalable MEC scheduling in temporary or emergency deployments, where network geometry and device counts vary rapidly.
Abstract
The intelligent reflection surface (IRS) and unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system is widely used in temporary and emergency scenarios. Our goal is to minimize the energy consumption of the MEC system by jointly optimizing UAV locations, IRS phase shifts, task offloading, and resource allocation when the number of UAVs varies. To this end, we propose a Flexible REsource Scheduling (FRES) framework based on a novel deep progressive reinforcement learning method, with the following innovations. First, a novel multi-task agent is presented to handle the mixed-integer nonlinear programming (MINLP) problem. The agent has two output heads designed for different tasks: a classification head makes the offloading decisions, which are integer variables, while a fitting head solves the resource allocation, which involves continuous variables. Second, a progressive scheduler adapts the agent to the varying number of UAVs by progressively adjusting a subset of the neurons in the agent. This structure naturally accumulates experience and is immune to catastrophic forgetting. Finally, a light taboo search (LTS) is introduced to enhance the global search capability of FRES. Numerical results demonstrate the superiority of the FRES framework, which achieves real-time and optimal resource scheduling even in dynamic MEC systems.
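The two-head design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TwoHeadAgent`, the layer sizes, the random initialization, the `tanh` trunk, and the choice to normalize the continuous outputs into shares of a single resource budget are all assumptions made for illustration. The sketch only shows how one shared trunk can feed a classification head (one discrete offloading target per device) and a fitting head (one continuous allocation per device).

```python
import math
import random


def linear(x, w, b):
    """Dense layer on plain Python lists: returns w @ x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


class TwoHeadAgent:
    """Hypothetical two-head agent: shared trunk, discrete head, continuous head."""

    def __init__(self, n_state, n_hidden, n_devices, n_targets, seed=0):
        rng = random.Random(seed)
        self.n_devices = n_devices
        self.n_targets = n_targets  # assumed: candidate offloading targets per device
        self.trunk = init_layer(n_hidden, n_state, rng)
        self.cls_head = init_layer(n_devices * n_targets, n_hidden, rng)
        self.fit_head = init_layer(n_devices, n_hidden, rng)

    def forward(self, state):
        # Shared representation used by both heads.
        h = [math.tanh(v) for v in linear(state, *self.trunk)]

        # Classification head: pick the most probable target for each device.
        logits = linear(h, *self.cls_head)
        decisions = []
        for d in range(self.n_devices):
            probs = softmax(logits[d * self.n_targets:(d + 1) * self.n_targets])
            decisions.append(max(range(self.n_targets), key=probs.__getitem__))

        # Fitting head: positive values, normalized to shares of one budget
        # (the shared-budget normalization is an assumption of this sketch).
        raw = [sigmoid(v) for v in linear(h, *self.fit_head)]
        total = sum(raw)
        shares = [v / total for v in raw]
        return decisions, shares


agent = TwoHeadAgent(n_state=6, n_hidden=8, n_devices=4, n_targets=3)
decisions, shares = agent.forward([0.1] * 6)
```

Here `decisions` holds one integer index per device and `shares` holds one fraction per device, summing to one. Training, the progressive scheduler, and the LTS refinement are outside the scope of this sketch.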
