A Deep Reinforcement Learning based Scheduler for IoT Devices in Co-existence with 5G-NR

Shahida Jabeen

TL;DR

The paper addresses in-band co-existence of NB-IoT/LTE-M devices with 5G-NR in dense multi-cell networks. It proposes a benchmark upper-bound scheduler for joint sub-carrier (SC) and modulation and coding scheme (MCS) allocation, together with a multi-agent deep reinforcement learning (DRL) framework whose actions are based on inter-cell interference (ICI) rather than power. Three DRL variants (DQN, PGN, DDPGN) are developed for both centralized and edge learning. The results show that interference-based scheduling substantially outperforms power-based scheduling, and that the policy-based methods approach centralized performance while minimizing data sharing. The study provides a detailed link-rate model, a GP-based convex upper-bound formulation, and extensive simulations under realistic fading, which it uses to compare throughput, fairness, and latency. The results support the practicality of edge learning for scalable, ICI-aware resource allocation in coexisting 5G-NR IoT deployments, with scalability and robustness identified as directions for future work.

Abstract

Co-existence of 5G New Radio (5G-NR) with IoT devices is considered a promising technique to enhance the spectral usage and efficiency of future cellular networks. In this paper, a unified framework is proposed for allocating in-band resource blocks (RBs) within a multi-cell network to 5G-NR users in co-existence with NB-IoT and LTE-M devices. First, a benchmark (upper-bound) scheduler is designed for joint sub-carrier (SC) and modulation and coding scheme (MCS) allocation that maximizes instantaneous throughput and fairness among users/devices, while considering synchronous RB allocation in the neighboring cells. A series of numerical simulations with realistic inter-cell interference (ICI) in an urban scenario is used to compute benchmark upper-bound solutions for characterizing performance in terms of throughput, fairness, and delay. Next, an edge learning based multi-agent deep reinforcement learning (DRL) framework is developed for different DRL algorithms, specifically, a policy-based gradient network (PGN), a deep Q-learning based network (DQN), and an actor-critic based deep deterministic policy gradient network (DDPGN). The proposed DRL framework depends on interference allocation, where the actions are based on ICI instead of power, which can bypass the need for raw data sharing and/or inter-agent communication. The numerical results reveal that the interference allocation based DRL schedulers can significantly outperform their counterparts whose actions are based on power allocation. Further, the performance of the proposed policy-based edge learning algorithms is close to that of the centralized ones.

Paper Structure

This paper contains 25 sections, 23 equations, 4 figures, 4 tables, 3 algorithms.

Figures (4)

  • Figure 1: Shannon's rate function ($s(\gamma)=\log_2(1+\gamma)$) vs. continuous rate function ($g(\gamma)=\gamma^{\log_{10}(e)}$) vs. piece-wise discrete rate function ($f(\gamma)$)
  • Figure 2: Centralized Training and Testing without small-scale fading
  • Figure 3: Centralized Training and Testing with small-scale fading
  • Figure 4: Edge Learning without small-scale fading
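
The three rate functions named in the Figure 1 caption can be compared numerically with a short sketch. The formulas for $s(\gamma)$ and $g(\gamma)$ are taken directly from the caption. The piece-wise discrete function $f(\gamma)$ is not specified in this summary, so the `MCS_TABLE` thresholds and rates below are hypothetical placeholders, chosen only so that the staircase stays at or below the Shannon curve. The sketch also assumes $\gamma$ is a linear (not dB) SINR value.

```python
import math

def shannon_rate(snr: float) -> float:
    """s(gamma) = log2(1 + gamma), as given in the Figure 1 caption."""
    return math.log2(1.0 + snr)

def continuous_rate(snr: float) -> float:
    """g(gamma) = gamma ** log10(e), as given in the Figure 1 caption.

    This is a monomial in gamma, the functional form that geometric
    programming (GP) operates on.
    """
    return snr ** math.log10(math.e)

# HYPOTHETICAL table of (minimum linear SINR, rate) pairs.
# These values are NOT from the paper; each threshold is 2**rate - 1,
# so the staircase touches the Shannon curve at its steps.
MCS_TABLE = [(1.0, 1.0), (3.0, 2.0), (15.0, 4.0), (63.0, 6.0)]

def discrete_rate(snr: float, table=MCS_TABLE) -> float:
    """f(gamma): rate of the highest table entry whose threshold is met."""
    rate = 0.0  # below the lowest threshold nothing can be decoded
    for threshold, entry_rate in table:
        if snr >= threshold:
            rate = entry_rate
    return rate

if __name__ == "__main__":
    for snr in (0.5, 1.0, 3.0, 10.0, 20.0, 100.0):
        print(snr, shannon_rate(snr), continuous_rate(snr), discrete_rate(snr))
```

Swapping in a real MCS table only requires replacing `MCS_TABLE`; the lookup logic is independent of the specific thresholds.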