QoS-Aware Scheduling in New Radio Using Deep Reinforcement Learning
Jakob Stigenberg, Vidit Saxena, Soma Tayamon, Euhanna Ghadimi
TL;DR
This work tackles QoS-aware time-domain scheduling in 5G NR by introducing QADRA, a deep reinforcement learning-based scheduler that explicitly sorts data flows to optimize both QoS satisfaction and network throughput. The method uses a selection-sort-inspired structure with two recurrent encoders and a Q-Network, trained end-to-end under a tunable preference vector ω that balances QoS against throughput. Results in a full-system NR simulator show throughput gains of around 30% with VoIP QoS preserved under appropriate ω configurations, and show that ω can be used to steer the trade-off between QoS satisfaction and overall throughput. The approach provides a scalable, configurable framework for NR scheduling that reduces reliance on handcrafted heuristics and supports operator-driven QoS trade-offs in real networks.
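To make the selection-sort-inspired structure and the role of the preference vector concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-element preference vector ω = (w_qos, w_tput) combined as a weighted sum, and it replaces the learned Q-Network with a hypothetical hand-written scoring function `make_toy_score`; the flow fields `urgency` and `rate` are likewise illustrative names that do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence

Flow = Dict[str, object]
ScoreFn = Callable[[Flow, List[Flow]], float]


def scalarized_reward(qos_rate: float, throughput: float,
                      omega: Sequence[float]) -> float:
    """Weighted sum of the two objectives (assumed form, not from the paper)."""
    w_qos, w_tput = omega
    return w_qos * qos_rate + w_tput * throughput


def make_toy_score(omega: Sequence[float]) -> ScoreFn:
    """Hypothetical stand-in for the learned Q-Network.

    A real agent would score a candidate flow from learned encodings of the
    already-ordered and still-unordered flows; here the context is ignored.
    """
    w_qos, w_tput = omega

    def score(flow: Flow, already_ordered: List[Flow]) -> float:
        return w_qos * float(flow["urgency"]) + w_tput * float(flow["rate"])

    return score


def selection_sort_schedule(flows: List[Flow], score: ScoreFn) -> List[Flow]:
    """Build a priority order by repeatedly picking the best remaining flow."""
    remaining = list(flows)          # copy so the caller's list is untouched
    ordered: List[Flow] = []
    while remaining:
        best = max(remaining, key=lambda f: score(f, ordered))
        remaining.remove(best)
        ordered.append(best)
    return ordered


if __name__ == "__main__":
    flows = [
        {"id": "voip", "urgency": 0.9, "rate": 0.1},
        {"id": "video", "urgency": 0.2, "rate": 0.8},
        {"id": "ftp", "urgency": 0.0, "rate": 1.0},
    ]
    for omega in [(1.0, 0.0), (0.0, 1.0)]:
        order = selection_sort_schedule(flows, make_toy_score(omega))
        print(omega, [f["id"] for f in order])
```

With these toy values, a QoS-only preference ranks the delay-sensitive flow first, while a throughput-only preference ranks the highest-rate flow first, which illustrates how a single vector could steer the ordering.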
Abstract
Fifth-generation (5G) New Radio (NR) cellular networks support a wide range of new services, many of which require an application-specific quality of service (QoS), e.g., a guaranteed minimum bit rate or a maximum tolerable delay. Scheduling multiple parallel data flows, each serving a unique application instance, is therefore bound to become even more challenging than in previous generations. Leveraging recent advances in deep reinforcement learning, we propose a QoS-Aware Deep Reinforcement Learning Agent (QADRA) scheduler for NR networks. In contrast to state-of-the-art scheduling heuristics, the QADRA scheduler explicitly optimizes for the QoS satisfaction rate while simultaneously maximizing network performance, and we train the algorithm end-to-end on these objectives. We evaluate QADRA in a full-scale, near-product, system-level NR simulator and demonstrate a significant boost in network performance. In our particular evaluation scenario, the QADRA scheduler improves network throughput by 30% compared to state-of-the-art baselines while maintaining the QoS satisfaction rate of the VoIP users served by the network.
