A Deep Reinforcement Learning-based Approach for Adaptive Handover Protocols

Johannes Voigt, Peter Jiacheng Gu, Peter Rost

TL;DR

The paper tackles frequent handovers in dense 5G NR networks at high frequencies by introducing a PPO-based deep reinforcement learning (DRL) agent deployed at base stations to adapt handover timing and decisions. It formulates the problem as a per-BS MDP with a state that includes a one-hot serving cell, a clipped SINR vector, and a ping-pong flag, and uses a reward that balances SINR quality with penalties for PP and RLF, optimizing the average rate $\bar{R}$ relative to the ideal $\bar{R}_\text{max}$ via the ratio $\bar{R}/\bar{R}_\text{max}$. The approach is validated in a realistic urban environment using the Vienna 5G System Level Simulator and SUMO for mobility, showing that the PPO-based method matches or slightly surpasses the 3GPP handover in average rate while reducing RLF, particularly at higher speeds. The work contributes a robust, adaptable HO framework with open-source code and datasets, signaling a promising direction for mobility management in next-generation HetNets.
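The state described above (one-hot serving cell, clipped SINR vector, ping-pong flag) could be assembled roughly as in the following sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `build_state`, the clipping bounds of -20 dB and 30 dB, and the rescaling of the clipped SINR to [0, 1] are all hypothetical choices that do not come from the summary.

```python
from typing import List, Sequence


def build_state(
    serving_cell: int,
    sinr_db: Sequence[float],
    ping_pong: bool,
    sinr_min_db: float = -20.0,  # assumed lower clipping bound (not from the paper)
    sinr_max_db: float = 30.0,   # assumed upper clipping bound (not from the paper)
) -> List[float]:
    """Concatenate one-hot serving cell, clipped SINR vector, and ping-pong flag."""
    n_cells = len(sinr_db)
    if not 0 <= serving_cell < n_cells:
        raise ValueError("serving_cell must index into sinr_db")
    if sinr_max_db <= sinr_min_db:
        raise ValueError("sinr_max_db must exceed sinr_min_db")

    # One-hot encoding of the currently serving base station.
    one_hot = [1.0 if i == serving_cell else 0.0 for i in range(n_cells)]

    # Clip each SINR value to the assumed range, then rescale to [0, 1].
    # The rescaling is an assumption made here for numerical convenience.
    span = sinr_max_db - sinr_min_db
    scaled = [
        (min(max(s, sinr_min_db), sinr_max_db) - sinr_min_db) / span
        for s in sinr_db
    ]

    # Single binary flag marking a ping-pong condition.
    return one_hot + scaled + [1.0 if ping_pong else 0.0]


if __name__ == "__main__":
    # Three cells, UE served by cell 1, ping-pong flag set.
    print(build_state(1, [-35.0, 5.0, 40.0], True))
```

For a deployment with N base stations, this layout yields a state of length 2N + 1. Clipping keeps extreme SINR readings from dominating the network input.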

Abstract

The use of higher frequencies in mobile communication systems leads to smaller cell sizes, resulting in the deployment of more base stations and an increase in handovers to support user mobility. This can lead to frequent radio link failures and reduced data rates. In this work, we propose a handover optimization method using proximal policy optimization (PPO) to develop an adaptive handover protocol. Our PPO-based agent, implemented in the base stations, is highly adaptive to varying user equipment speeds and outperforms the 3GPP-standardized 5G NR handover procedure in terms of average data rate and radio link failure rate. Additionally, our simulation environment is carefully designed to ensure high accuracy, realistic user movements, and fair benchmarking against the 3GPP handover method.

Paper Structure

This paper contains 17 sections, 16 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: The centralized DRL agent includes the environment and trajectory buffer (orange), a proximal policy optimization actor-critic unit (blue), the objective, i.e., loss computation (green), and an optimizer to update the network parameters (red).
  • Figure 2: Simulated region with areas of different speed limits and pedestrian zones. BSs are indicated by circles, and crosses mark the path of a UE in the color of the serving BS. The axes indicate the distance in meters.
  • Figure 3: HOF triggered after HO preparation, causing an RLF.
  • Figure 4: Achieved relative average rate $\Gamma_\text{R}$ of the 3GPP and PPO-based HO protocol for different UE velocities.
  • Figure 5: ECDF of SINR before (solid) and after (dashed) handover execution under 3GPP and PPO policies. Red lines indicate $Q_\text{out}$ (solid) and $Q_\text{in}$ (dashed) SINR QoS thresholds.
  • ...and 1 more figure

Theorems & Definitions (3)

  • Definition 1: Radio Link Failure
  • Definition 2: Handover Failure
  • Definition 3: Ping-pong Handover