Optimizing Wireless Discontinuous Reception via MAC Signaling Learning

Adriano Pastore, Adrián Agustín de Dios, Álvaro Valcarce

TL;DR

This work introduces a reinforcement learning framework that optimizes Discontinuous Reception (DRX) in 5G NR networks by timing MAC CE signaling rather than tuning DRX timers. DRX signaling is formulated as a protocol-learning problem and solved with a DQN-based agent, which saves energy while preserving latency targets for XR-like traffic. Both Rel-17-compliant MAC CEs and non-compliant ("beyond 5G") signaling options are considered. Key contributions include a rich per-UE and cell-wide state space, a reward that balances idle time against latency satisfaction, and comparative results: relative to a naïve MAC CE transmission policy, active time is nearly halved for a single UE and reduced by close to 20% for 9 simultaneously served UEs. The findings indicate that automated, fine-grained DRX control via low-layer signaling is practically viable, with potential implications for energy efficiency in future wireless networks and for cross-UE optimization in more complex scheduling environments.

Abstract

We present a Reinforcement Learning (RL) approach to the problem of controlling the Discontinuous Reception (DRX) policy from a Base Transceiver Station (BTS) in a cellular network. We do so by optimally timing the transmission of fast Layer-2 signaling messages (a.k.a. Medium Access Control (MAC) Control Elements (CEs) as specified in 5G New Radio). Unlike more conventional approaches to DRX optimization, which rely on fine-tuning the values of DRX timers, we assess the gains that can be obtained solely by means of this MAC CE signaling. For the simulation part, we concentrate on traffic types typically encountered in Extended Reality (XR) applications, where battery drain minimization and overheating mitigation are particularly pressing. Both 3GPP 5G New Radio (5G NR) compliant and non-compliant ("beyond 5G") MAC CEs are considered. Our simulation results show that the proposed technique strikes an improved trade-off between latency and energy savings compared to the conventional timer-based approaches that are characteristic of most current implementations. Specifically, our RL-based policy can nearly halve the active time for a single User Equipment (UE) with respect to a naïve MAC CE transmission policy, and still achieves a nearly 20% active time reduction for 9 simultaneously served UEs.
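To make the role of MAC CE timing concrete, the following is a minimal toy sketch and not the paper's simulator. It assumes a slot-based model in which the UE is awake in every slot where data arrives, stays awake while an inactivity timer runs, and goes to sleep as soon as a DRX Command MAC CE is received. The function name `count_active_slots`, the policy signature, and all numeric values are hypothetical and chosen only for illustration. Latency effects, which the paper trades off against energy, are deliberately not modelled here.

```python
from typing import Callable, Iterable


def count_active_slots(
    arrivals: Iterable[int],
    horizon: int,
    inactivity_timer: int,
    send_ce: Callable[[int, int], bool],
) -> int:
    """Count the slots in which the UE is awake (toy model, see assumptions)."""
    arrival_set = set(arrivals)
    timer = 0   # remaining slots of the (hypothetical) inactivity timer
    idle = 0    # awake slots without data since the last arrival
    active = 0
    for t in range(horizon):
        if t in arrival_set:
            # Toy assumption: the UE is always awake when data arrives.
            active += 1
            timer = inactivity_timer  # data (re)starts the inactivity timer
            idle = 0
        elif timer > 0:
            # No data, but the timer keeps the UE awake in this slot.
            active += 1
            timer -= 1
            idle += 1
            if send_ce(t, idle):
                # The MAC CE is received in this slot and stops the timer.
                timer = 0
    return active


# Hypothetical traffic: three isolated packets, 60-slot horizon, 8-slot timer.
arrivals = [0, 20, 40]

# Timer-only operation: never send a MAC CE.
timer_only = count_active_slots(arrivals, 60, 8, lambda t, idle: False)

# Naive policy: send a MAC CE in the first empty slot after each packet.
immediate_ce = count_active_slots(arrivals, 60, 8, lambda t, idle: True)

print(timer_only, immediate_ce)
```

In this toy setting the policy argument `send_ce` is the only degree of freedom, which mirrors the idea of controlling DRX through signaling alone. A learned policy would replace the fixed lambdas with a decision that depends on the observed state; how the paper's agent does this is not reproduced here.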

Paper Structure (15 sections, 6 equations, 10 figures, 3 tables)

This paper contains 15 sections, 6 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Wireless network architecture.
  • Figure 2: Power consumption profile over time for a UE operated with timer-based DRX.
  • Figure 3: Illustration of the generation. In this example, the queue contents are collections of that have arrived at $t_1$, $t_2$ and $t_3$, respectively. They get segmented by the layer into a sequence of of different sizes (depending on the channel conditions). Zero padding is applied to fill out the last corresponding to each batch of payload data.
  • Figure 4: Pattern of periodic CSI reporting with interruptions due to the inactive mode.
  • Figure 5: Convergence of the cumulative reward for $|\mathcal{A}|=2$ and a variable number of UEs, averaged over $N_\mathrm{runs}=30$ independent runs of the training. Traffic statistics, system model and parameters are as described in Tables \ref{table:XR_traffic}, \ref{tab:sim_config}, and \ref{tab:rl_hp}.
  • ...and 5 more figures