Enhancing User Throughput in Multi-panel mmWave Radio Access Networks for Beam-based MU-MIMO Using a DRL Method

Ramin Hashemi, Vismika Ranasinghe, Teemu Veijalainen, Petteri Kela, Risto Wichman

TL;DR

This paper proposes a deep reinforcement learning (DRL) approach that enhances user throughput in multi-panel mmWave radio access networks in a practical network setup. It incorporates the cross-correlation between beams in different antenna panels, the measured reference signal received power (RSRP), and beam usage statistics to dynamically adjust beamforming decisions.

Abstract

Millimeter-wave (mmWave) communication systems, particularly those leveraging multi-user multiple-input multiple-output (MU-MIMO) with hybrid beamforming, face challenges in optimizing user throughput and minimizing latency due to the high complexity of dynamic beam selection and management. This paper introduces a deep reinforcement learning (DRL) approach for enhancing user throughput in multi-panel mmWave radio access networks in a practical network setup. Our DRL-based formulation utilizes an adaptive beam management strategy that models the interaction between the communication agent and its environment as a Markov decision process (MDP), optimizing beam selection based on real-time observations. The proposed framework exploits spatial domain (SD) characteristics by incorporating the cross-correlation between the beams in different antenna panels, the measured reference signal received power (RSRP), and the beam usage statistics to dynamically adjust beamforming decisions. As a result, the spectral efficiency is improved and end-to-end latency is reduced. The numerical results demonstrate an increase in throughput of up to 16% and a reduction in latency by a factor of 3 to 7 compared to the baseline (legacy beam management).
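
To make the abstract's ingredients concrete, the following is a minimal sketch, not the authors' implementation, of how the three named inputs (beam cross-correlation, RSRP, and beam usage statistics) could be assembled into an MDP observation, together with a double-DQN bootstrap target. It assumes that "DDQN" in the figure captions denotes a double deep Q-network. The function names, the RSRP normalization range, and the feature layout are all hypothetical choices made for illustration.

```python
import numpy as np


def beam_cross_correlation(codebook: np.ndarray) -> np.ndarray:
    """Magnitude of normalized inner products between beam weight vectors.

    codebook: complex array of shape (num_beams, num_antennas).
    """
    norms = np.linalg.norm(codebook, axis=1, keepdims=True)
    unit = codebook / norms
    return np.abs(unit @ unit.conj().T)


def build_observation(rsrp_dbm, usage_counts, corr, active_beams):
    """Concatenate the three feature groups named in the abstract.

    The layout and normalizations here are assumptions, not the paper's.
    """
    rsrp = np.asarray(rsrp_dbm, dtype=float)
    # Hypothetical normalization: map an assumed [-140, -40] dBm range to [0, 1].
    rsrp_norm = np.clip((rsrp + 140.0) / 100.0, 0.0, 1.0)

    usage = np.asarray(usage_counts, dtype=float)
    usage_norm = usage / max(usage.sum(), 1.0)

    # Strongest correlation of each candidate beam with any already-active beam.
    if len(active_beams) > 0:
        interference = corr[:, list(active_beams)].max(axis=1)
    else:
        interference = np.zeros(corr.shape[0])

    return np.concatenate([rsrp_norm, usage_norm, interference])


def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN target: the online network picks the next action,
    the target network supplies that action's value."""
    best_action = int(np.argmax(q_online_next))
    bootstrap = 0.0 if done else gamma * float(q_target_next[best_action])
    return reward + bootstrap


# Usage: a 4-beam DFT codebook over 4 antennas (mutually orthogonal beams).
n = np.arange(4)
codebook = np.exp(-2j * np.pi * np.outer(n, n) / 4)
corr = beam_cross_correlation(codebook)
obs = build_observation([-90, -100, -110, -120], [3, 1, 0, 0], corr, active_beams=[0])
target = double_dqn_target(1.0, np.array([0.2, 0.9, 0.1]), np.array([0.5, 0.3, 0.8]), gamma=0.5)
```

Decoupling action selection (online network) from action evaluation (target network) is the standard double-DQN way to reduce Q-value overestimation. Whether the paper uses exactly this target, and how it defines its state and reward, cannot be determined from the abstract alone.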

Paper Structure

This paper contains 8 sections, 9 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Beam selection approach in beam-based MU-MIMO with $M_p=4$ antenna panels. The activated beams in each panel are spatially separated from beams within other panels; however, spatially closer beams could have a non-zero cross-correlation.
  • Figure 2: Possible dimensions to consider in the beam-based MU-MIMO beam management procedure, beyond the traditional RSRP-based legacy approach.
  • Figure 3: Impact of exploration during training on the average MT throughput (total number of MTs in the network $U=210$).
  • Figure 4: The CDF of user throughput ($U=210$).
  • Figure 5: Comparison of geometric mean user throughput and end-to-end latency for RL-DDQN vs. the baseline.
  • ...and 1 more figures
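
The metrics named in the captions of Figures 4 and 5, the empirical CDF of user throughput and the geometric mean user throughput, can be computed from per-user samples as in the sketch below. It is illustrative only: the function names are hypothetical and the sample values are made up for the usage example, not taken from the paper.

```python
import numpy as np


def geometric_mean(throughputs):
    """Geometric mean of strictly positive per-user throughputs.

    Computed in the log domain for numerical stability.
    """
    x = np.asarray(throughputs, dtype=float)
    if x.size == 0 or np.any(x <= 0):
        raise ValueError("throughputs must be non-empty and strictly positive")
    return float(np.exp(np.mean(np.log(x))))


def empirical_cdf(throughputs):
    """Return sorted samples and the fraction of users at or below each one."""
    x = np.sort(np.asarray(throughputs, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p


# Usage with made-up values (arbitrary units).
gm = geometric_mean([1.0, 4.0, 16.0])
xs, ps = empirical_cdf([3.0, 1.0, 2.0])
```

Unlike the arithmetic mean, the geometric mean is pulled down sharply by users with very low throughput, so it reflects fairness across users as well as raw capacity.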