Digital Twin Assisted Deep Reinforcement Learning for Online Admission Control in Sliced Network
Zhenyu Tao, Wei Xu, Xiaohu You
TL;DR
The proposed digital twin (DT) accelerated DRL solution stabilizes online DRL training and speeds up its convergence, improving resource utilization by up to 26.39% over the state-of-the-art DRL model while matching the online DRL method in long-term revenue.
Abstract
The proliferation of diverse wireless services in 5G and beyond has led to the emergence of network slicing technologies. Among these technologies, admission control plays a crucial role in achieving service-oriented optimization goals by selectively accepting service requests. Although deep reinforcement learning (DRL) underpins many admission control approaches thanks to its effectiveness and flexibility, the initial instability and excessive convergence delay of DRL models hinder their deployment in real-world networks. We propose a digital twin (DT) accelerated DRL solution to address this issue. Specifically, we first formulate the admission decision-making process as a semi-Markov decision process, which we then simplify into an equivalent discrete-time Markov decision process to facilitate the implementation of DRL methods. A neural network-based DT with a customized output layer for queuing systems is trained through supervised learning and then employed to assist the training phase of the DRL model. Extensive simulations show that the DT-accelerated DRL improves resource utilization by over 40% compared to the directly trained state-of-the-art dueling deep Q-learning model, while preserving the model's capability to optimize the long-term rewards of the admission process.
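To make the discrete-time view of admission control concrete, the following is a minimal sketch of a toy environment in which a decision is taken once per arriving request. It is not the paper's system model: the class name `ToyAdmissionEnv`, the two request types, their demands, revenues, and departure probabilities, and the helper `run_episode` are all illustrative assumptions, and neither the DRL agent nor the DT is implemented here.

```python
import random


class ToyAdmissionEnv:
    """Toy slice-admission environment with one decision per arriving request.

    All parameters are hypothetical and chosen only for illustration.
    """

    def __init__(self, capacity=10, seed=0):
        self.capacity = capacity
        self.rng = random.Random(seed)
        # Assumed request types: (resource demand, revenue, departure probability per step)
        self.types = [(1, 1.0, 0.3), (3, 4.0, 0.1)]
        self.active = []  # type indices of currently admitted requests
        self.pending = self._new_request()

    def _new_request(self):
        # Draw the type of the next arriving request uniformly at random
        return self.rng.randrange(len(self.types))

    def used(self):
        # Total resources held by admitted requests
        return sum(self.types[t][0] for t in self.active)

    def state(self):
        # State seen by the decision maker: current load and pending request type
        return (self.used(), self.pending)

    def step(self, accept):
        demand, revenue, _ = self.types[self.pending]
        reward = 0.0
        # Admit only if requested and the capacity constraint still holds
        if accept and self.used() + demand <= self.capacity:
            self.active.append(self.pending)
            reward = revenue
        # Each active request departs independently before the next arrival
        self.active = [t for t in self.active
                       if self.rng.random() >= self.types[t][2]]
        self.pending = self._new_request()
        return self.state(), reward


def run_episode(env, policy, steps=1000):
    """Run a fixed policy; return total revenue and mean resource utilization."""
    total_reward, total_used = 0.0, 0
    for _ in range(steps):
        _, reward = env.step(policy(env.state()))
        total_reward += reward
        total_used += env.used()
    return total_reward, total_used / (steps * env.capacity)
```

A learned policy would replace the fixed `policy` callable, for example `run_episode(ToyAdmissionEnv(), lambda s: True)` evaluates an accept-whenever-feasible baseline. In the setting described above, a DT trained on logged transitions would stand in for such an environment during the early training phase.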
