A Deep Reinforcement Learning-based Approach for Adaptive Handover Protocols
Johannes Voigt, Peter Jiacheng Gu, Peter Rost
TL;DR
The paper tackles frequent handovers in dense 5G NR networks at high frequencies by introducing a deep reinforcement learning (DRL) agent, based on proximal policy optimization (PPO) and deployed at the base stations (BSs), that adapts handover (HO) timing and decisions. It formulates the problem as a per-BS Markov decision process (MDP) whose state includes a one-hot encoding of the serving cell, a clipped SINR vector, and a ping-pong (PP) flag. The reward balances SINR quality against penalties for PP handovers and radio link failures (RLFs), with the aim of maximizing the average rate $\bar{R}$ relative to the ideal rate $\bar{R}_{max}$, expressed as the ratio $\bar{R}/\bar{R}_{max}$. The approach is validated in a realistic urban environment using the Vienna 5G System Level Simulator, with SUMO providing user mobility. The results show that the PPO-based method matches or slightly surpasses the 3GPP handover procedure in average rate while reducing the RLF rate, particularly at higher speeds. The work contributes a robust, adaptable HO framework with open-source code and datasets, pointing to a promising direction for mobility management in next-generation HetNets.
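To make the MDP description above more concrete, the following Python sketch shows one way the state vector, the reward, and the rate ratio $\bar{R}/\bar{R}_{max}$ could be computed. It is only an illustration under stated assumptions: the function names, the SINR clipping range, the scaling of the clipped SINR to [0, 1], and the penalty weights are hypothetical and are not taken from the paper.

```python
import numpy as np

# Hypothetical values for illustration only; the paper's actual clipping
# range and penalty weights are not given in this summary.
SINR_MIN_DB, SINR_MAX_DB = -20.0, 30.0
PP_PENALTY, RLF_PENALTY = 0.5, 1.0


def scale_sinr(sinr_db):
    """Clip SINR (dB) to the assumed range and scale it to [0, 1]."""
    clipped = np.clip(sinr_db, SINR_MIN_DB, SINR_MAX_DB)
    return (clipped - SINR_MIN_DB) / (SINR_MAX_DB - SINR_MIN_DB)


def build_state(serving_cell, sinr_db, ping_pong):
    """Concatenate one-hot serving cell, clipped SINR vector, and PP flag."""
    sinr_db = np.asarray(sinr_db, dtype=float)
    one_hot = np.zeros(sinr_db.size)
    one_hot[serving_cell] = 1.0
    return np.concatenate([one_hot, scale_sinr(sinr_db), [float(ping_pong)]])


def reward(serving_sinr_db, ping_pong, rlf):
    """SINR quality term minus (assumed) penalties for ping-pong and RLF."""
    quality = scale_sinr(serving_sinr_db)
    return float(quality - PP_PENALTY * ping_pong - RLF_PENALTY * rlf)


def rate_ratio(rates, max_rates):
    """Average achieved rate relative to the average ideal rate."""
    return float(np.mean(rates) / np.mean(max_rates))


if __name__ == "__main__":
    state = build_state(serving_cell=1, sinr_db=[-40.0, 5.0, 50.0], ping_pong=True)
    print(state)
    print(reward(5.0, ping_pong=False, rlf=False))
    print(rate_ratio([1.0, 3.0], [2.0, 6.0]))
```

For a network with N candidate cells, the state in this sketch has 2N + 1 entries: N for the one-hot serving cell, N for the SINR vector, and one for the PP flag.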
Abstract
The use of higher frequencies in mobile communication systems leads to smaller cell sizes, resulting in the deployment of more base stations and an increase in handovers to support user mobility. This can lead to frequent radio link failures and reduced data rates. In this work, we propose a handover optimization method using proximal policy optimization (PPO) to develop an adaptive handover protocol. Our PPO-based agent, implemented in the base stations, is highly adaptive to varying user equipment speeds and outperforms the 3GPP-standardized 5G NR handover procedure in terms of average data rate and radio link failure rate. Additionally, our simulation environment is carefully designed to ensure high accuracy, realistic user movements, and fair benchmarking against the 3GPP handover method.
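For readers unfamiliar with PPO, the sketch below computes the standard clipped surrogate objective that PPO maximizes during policy updates. It is a generic NumPy illustration, not the authors' implementation; the function name is hypothetical, and the clipping parameter of 0.2 is a commonly used default assumed here rather than a value reported in this summary.

```python
import numpy as np


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. clip_eps is an assumed default.
    """
    ratio = np.exp(np.asarray(new_logp, dtype=float) - np.asarray(old_logp, dtype=float))
    adv = np.asarray(advantages, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * adv
    # The element-wise minimum removes the incentive to move the policy
    # far away from the one that collected the data.
    return float(np.mean(np.minimum(unclipped, clipped)))
```

With a positive advantage, a probability ratio above 1 + clip_eps earns no extra objective value, which limits how far a single update can push the handover policy.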
