AutoBS: Autonomous Base Station Deployment with Reinforcement Learning and Digital Network Twins

Ju-Hyung Lee, Andreas F. Molisch

TL;DR

AutoBS presents a PPO-based deep reinforcement learning framework integrated with a PMNet digital network twin to autonomously deploy base stations in 6G RANs. By modeling deployment as an MDP and using fast pathloss predictions for reward evaluation, AutoBS achieves near-optimal capacity (up to ~95% of exhaustive search) with inference times in milliseconds, enabling real-time, scalable optimization for dense urban networks. The approach supports both static single-BS and asynchronous multi-BS deployment, demonstrating strong performance gains over heuristic baselines and substantial reductions in computation time. This work offers a practical, adaptable framework for large-scale network topology optimization and can extend to related tasks such as mobility management and energy-efficient BS operation.

Abstract

This paper introduces AutoBS, a reinforcement learning (RL)-based framework for optimal base station (BS) deployment in 6G radio access networks (RAN). AutoBS leverages the Proximal Policy Optimization (PPO) algorithm and fast, site-specific pathloss predictions from PMNet, a generative model for digital network twins (DNT). By efficiently learning deployment strategies that balance coverage and capacity, AutoBS achieves about 95% of the capacity of exhaustive search in single-BS scenarios (and about 90% for multiple BSs), while cutting inference time from hours to milliseconds, making it well suited to real-time applications (e.g., ad-hoc deployments). AutoBS therefore provides a scalable, automated solution for large-scale 6G networks, meeting the demands of dynamic environments with minimal computational overhead.
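
To make the "balance coverage and capacity" idea concrete, the following is a minimal sketch of how two reward terms could be derived from a predicted pathloss map. It is not the paper's reward definition: the function name `reward_terms`, the transmit power, noise floor, coverage threshold, and bandwidth are all illustrative assumptions, and the pathloss map is taken to be a 2D array of positive dB values such as a model like PMNet might output.

```python
import numpy as np

def reward_terms(pathloss_db, tx_power_dbm=30.0, noise_dbm=-90.0,
                 coverage_threshold_dbm=-80.0, bandwidth_hz=20e6):
    """Return (coverage fraction, mean capacity in Mbps) for one pathloss map.

    All default parameter values are assumptions chosen for illustration.
    """
    # Received power per pixel: transmit power minus pathloss (both in dB scale).
    rx_dbm = tx_power_dbm - np.asarray(pathloss_db, dtype=float)

    # Coverage: fraction of pixels whose received power meets the threshold.
    coverage = float(np.mean(rx_dbm >= coverage_threshold_dbm))

    # Capacity: Shannon capacity per pixel from the linear SNR, averaged over the map.
    snr_linear = 10.0 ** ((rx_dbm - noise_dbm) / 10.0)
    capacity_bps = bandwidth_hz * np.log2(1.0 + snr_linear)
    mean_capacity_mbps = float(np.mean(capacity_bps)) / 1e6

    return coverage, mean_capacity_mbps

# Usage: a 2x2 map where half the pixels see 100 dB loss and half see 120 dB.
coverage, capacity = reward_terms(np.array([[100.0, 100.0], [120.0, 120.0]]))
```

A scalar reward could then be formed as a weighted combination of the two terms; how AutoBS actually weights or normalizes them is not stated in this summary.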

Paper Structure

This paper contains 19 sections, 10 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Overview of the AutoBS framework, where the DRL agent leverages PMNet’s generative pathloss map (DNT of the site) to evaluate coverage and determine optimal BS locations.
  • Figure 2: Training process for AutoBS framework, with PMNet providing generative pathloss predictions to compute the reward at each step.
  • Figure 3: Coverage comparison for Single (Static) BS deployment. Simulations are performed using Wireless InSite and visualized with SionnaRT. Light green areas indicate regions with higher received signal strength. The building-map panel illustrates the building map $\mathcal{S}$ used in each state $s_t$.
  • Figure 4: Convergence behavior for Single (Static) BS deployment for Heuristic, Exhaustive, and AutoBS deployments over $200$ steps.
  • Figure 5: Comparison results for Multi (Asynchronous) BS deployment in terms of coverage using SionnaRT. Light green areas indicate higher received signal strength. The building-map panel shows the building map $\mathcal{S}$ used in each state $s_t$.
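
The search-based baselines referenced in the figures can be illustrated with a small sketch. It assumes that asynchronous multi-BS deployment means BSs are added one at a time while earlier placements stay fixed, and it uses a greedy per-step exhaustive search rather than the learned AutoBS policy. The function `toy_pathloss_db` is a hypothetical log-distance stand-in for a pathloss predictor such as PMNet, and the best-server rule (per-pixel minimum pathloss) and the 70 dB coverage limit are likewise assumptions.

```python
import numpy as np

def toy_pathloss_db(bs_rc, building_mask):
    """Hypothetical pathloss model: log-distance loss plus a penalty inside buildings."""
    rows, cols = np.indices(building_mask.shape)
    dist = np.hypot(rows - bs_rc[0], cols - bs_rc[1]) + 1.0
    return 40.0 + 30.0 * np.log10(dist) + 20.0 * building_mask

def covered_fraction(pathloss_db, building_mask, max_loss_db=70.0):
    """Fraction of outdoor pixels whose pathloss is within the assumed limit."""
    outdoor = building_mask == 0
    return float(np.mean(pathloss_db[outdoor] <= max_loss_db))

def place_sequentially(building_mask, num_bs, max_loss_db=70.0):
    """Greedily add BSs one at a time, searching every free cell at each step."""
    combined = np.full(building_mask.shape, np.inf)  # best-server pathloss so far
    candidates = [(int(r), int(c)) for r, c in zip(*np.nonzero(building_mask == 0))]
    placed, history = [], []
    for _ in range(num_bs):
        best_rc, best_cov, best_map = None, -1.0, None
        for rc in candidates:
            if rc in placed:
                continue
            # Each candidate costs one pathloss evaluation, which is what makes
            # exhaustive search slow when the evaluation is a ray-tracing run.
            trial = np.minimum(combined, toy_pathloss_db(rc, building_mask))
            cov = covered_fraction(trial, building_mask, max_loss_db)
            if cov > best_cov:
                best_rc, best_cov, best_map = rc, cov, trial
        placed.append(best_rc)
        combined = best_map
        history.append(best_cov)
    return placed, history

# Usage: a 16x16 site with one 4x4 building in the middle, two BSs.
mask = np.zeros((16, 16))
mask[6:10, 6:10] = 1
locations, coverage_per_step = place_sequentially(mask, num_bs=2)
```

The cost of this loop grows with the number of candidate cells times the cost of one pathloss evaluation, which is the overhead a trained policy avoids by proposing a location in a single forward pass.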