QoS-Aware Scheduling in New Radio Using Deep Reinforcement Learning

Jakob Stigenberg, Vidit Saxena, Soma Tayamon, Euhanna Ghadimi

TL;DR

This work tackles QoS-aware time-domain scheduling in 5G NR by introducing QADRA, a deep reinforcement learning-based scheduler that explicitly sorts data flows to optimize QoS satisfaction and network throughput. The method uses a selection-sort-inspired structure with two recurrent encoders and a Q-Network, trained end-to-end under a tunable preference vector ω that balances QoS against throughput. Results in a full-system NR simulator show throughput gains of around 30% with VoIP QoS preserved under appropriate ω configurations, and show that varying ω steers the trade-off between QoS satisfaction and overall throughput. The approach provides a scalable, configurable framework for NR scheduling that reduces reliance on handcrafted heuristics and supports operator-driven QoS trade-offs in real networks.

Abstract

Fifth-generation (5G) New Radio (NR) cellular networks support a wide range of new services, many of which require an application-specific quality of service (QoS), e.g. in terms of a guaranteed minimum bit-rate or a maximum tolerable delay. Therefore, scheduling multiple parallel data flows, each serving a unique application instance, is bound to become an even more challenging task compared to the previous generations. Leveraging recent advances in deep reinforcement learning, in this paper, we propose a QoS-Aware Deep Reinforcement learning Agent (QADRA) scheduler for NR networks. In contrast to state-of-the-art scheduling heuristics, the QADRA scheduler explicitly optimizes for the QoS satisfaction rate while simultaneously maximizing the network performance. Moreover, we train our algorithm end-to-end on these objectives. We evaluate QADRA in a full scale, near-product, system level NR simulator and demonstrate a significant boost in network performance. In our particular evaluation scenario, the QADRA scheduler improves network throughput by 30% while simultaneously maintaining the QoS satisfaction rate of VoIP users served by the network, compared to state-of-the-art baselines.

Paper Structure

This paper contains 14 sections, 10 equations, 5 figures, 2 tables, 3 algorithms.

Figures (5)

  • Figure 1: Overview of the scheduling process. Given a set of data flows, the TD scheduler sorts the flows in descending order of priority. The FD scheduler then allocates resources according to the priority assigned to each flow.
  • Figure 2: Block diagram of QoS-aware scheduling algorithm proposed in this paper. The TD scheduling unit is trained with the output of the FD scheduling unit made available through a feedback loop.
  • Figure 3: Overview of network architecture employed by the QADRA scheduler. Two recurrent networks capture the states of the input and output lists, $s$, that are then combined with an action, $a$, and evaluated using the Q-Network.
  • Figure 4: Starvation in practice. One full buffer downlink data flow and a varying number of VoIP users, each served by two data flows. The downlink cell throughput is greatly reduced with increasing number of VoIP data flows.
  • Figure 5: Empirical cumulative distributions of (a) downlink network throughput and (b) VoIP packet delays for the evaluated techniques.
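
The selection-sort-style ordering described in the TL;DR and in Figures 1 and 3 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the flow features `(delay_urgency, normalized_rate)`, the names `selection_sort_schedule` and `make_toy_q`, and the linear scoring rule are all assumptions made for illustration. In QADRA the score would come from a trained Q-Network fed by two recurrent encoders; here a hand-written stand-in weighted by a two-element ω takes its place, and it ignores the list states that the real network conditions on.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical per-flow features: (delay_urgency, normalized_rate), both in [0, 1].
Flow = Tuple[float, float]
# Q(unsorted list state, sorted list state, candidate flow) -> score.
QFunction = Callable[[Sequence[Flow], Sequence[Flow], Flow], float]


def selection_sort_schedule(flows: Sequence[Flow], q_value: QFunction) -> List[int]:
    """Return flow indices in descending priority, picking one flow per step."""
    remaining = list(range(len(flows)))
    ordered: List[int] = []
    while remaining:
        # State = flows still to be placed plus flows already placed.
        unsorted_state = [flows[i] for i in remaining]
        sorted_state = [flows[i] for i in ordered]
        # Action = which remaining flow to append next; take the best-scoring one.
        best = max(
            remaining,
            key=lambda i: q_value(unsorted_state, sorted_state, flows[i]),
        )
        ordered.append(best)
        remaining.remove(best)
    return ordered


def make_toy_q(omega: Tuple[float, float]) -> QFunction:
    """Stand-in for a learned Q-Network (assumption): a linear score under omega."""
    w_qos, w_tp = omega

    def q_value(unsorted_state, sorted_state, candidate):
        urgency, rate = candidate
        # The toy score ignores both list states; a learned model would use them.
        return w_qos * urgency + w_tp * rate

    return q_value


example_flows = [(0.9, 0.2), (0.1, 0.8), (0.5, 0.5)]
print(selection_sort_schedule(example_flows, make_toy_q((1.0, 0.0))))  # [0, 2, 1]
print(selection_sort_schedule(example_flows, make_toy_q((0.0, 1.0))))  # [1, 2, 0]
```

With ω weighted entirely toward QoS, the delay-critical flow is placed first; with ω weighted entirely toward throughput, the high-rate flow leads. The resulting list is what the TD stage would hand to the FD scheduler for resource allocation.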