
Scalable machine learning-based approaches for energy saving in densely deployed Open RAN

Xuanyu Liang, Ahmed Al-Tahmeesschi, Swarna Chetty, Cicek Cavdar, Berk Canberk, Hamed Ahmadi

Abstract

Densely deployed base stations are responsible for the majority of the energy consumed in the Radio Access Network (RAN). While these deployments are crucial to deliver the required data rate in the busy hours of the day, the network can save energy by switching some of them to sleep mode while maintaining coverage and quality of service with the remaining ones. Benefiting from the flexibility provided by Open RAN in embedding machine learning (ML) in network operations, in this work we propose Deep Reinforcement Learning (DRL)-based energy saving solutions. First, we propose three different DRL-based methods in the form of xApps which control the Active/Sleep mode of up to 6 radio units (RUs) from the Near-Real-Time RAN Intelligent Controller (Near-RT RIC). We also propose a more scalable federated DRL-based solution with an aggregator as an rApp in the Non-Real-Time RIC and local agents as xApps. Our simulation results demonstrate the convergence of the proposed methods. We also compare the performance of our federated DRL across three layouts spanning 6--24 RUs and 500--1000\,m regions, including a composite multi-region scenario. The results show that our proposed federated TD3 algorithm achieves up to 43.75\% faster convergence, more than 50\% network energy saving, and 37.4\% lower training energy versus centralized baselines, while maintaining the quality of service and improving the robustness of the policy.
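The sleep-mode control task described above can be read as a standard DRL formulation. Below is a minimal, hypothetical Python sketch of one environment step for a single area: the state is the per-RU offered load, the action is a binary active/sleep mask, and the reward trades the energy drawn by active RUs against a penalty for unserved traffic. The reward shape, the penalty weight, and the power figures (21 W per active RU is inferred from the 126 W maximum over 6 RUs in Figure 5; the sleep power is a pure assumption) are illustrative and are not the paper's exact model.

```python
import numpy as np

# Hypothetical sketch: RU sleep-mode control as a DRL step.
# State: per-RU offered load; action: binary active/sleep mask;
# reward: energy drawn by active RUs plus a QoS penalty.
# All constants and names here are illustrative assumptions.

N_RUS = 6                        # RUs in one 500 m x 500 m area
P_ACTIVE = 21.0                  # watts; 126 W max / 6 RUs (Figure 5)
P_SLEEP = 0.05                   # watts; assumed deep-sleep draw
QOS_WEIGHT = 10.0                # assumed penalty weight for unserved load

def step(ru_load, action):
    """One environment step: apply an active/sleep mask to the RUs.

    ru_load : np.ndarray of shape (N_RUS,), offered traffic per RU
    action  : np.ndarray of shape (N_RUS,), 1 = active, 0 = sleep
    """
    power = np.where(action == 1, P_ACTIVE, P_SLEEP).sum()
    # Traffic of sleeping RUs must be absorbed by active neighbours;
    # here we crudely approximate a QoS violation by the unserved load.
    unserved = ru_load.sum() - (ru_load * action).sum()
    return -power - QOS_WEIGHT * unserved

rng = np.random.default_rng(0)
print(step(rng.uniform(0, 1, N_RUS), rng.integers(0, 2, N_RUS)))
```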


Paper Structure

This paper contains 20 sections, 31 equations, 11 figures, 3 tables, and 2 algorithms.

Figures (11)

  • Figure 1: Illustration of the O-RAN architecture incorporating both O-RAN-defined and 3GPP-standard interfaces. Solid lines represent O-RAN interfaces (e.g., E2, A1, O1, Open FH), while dashed lines indicate 3GPP interfaces. The system spans four geographical areas, each with its own O-DU and O-CU components. Near-RT RICs operate on a per-area basis, deploying multiple xApps. A centralized Non-RT RIC performs global policy aggregation and training coordination using the A1 interface.
  • Figure 2: Illustrates the workflow of Fed-TD3 across distributed agents and an aggregator. Red solid lines indicate local critic training based on the temporal-difference loss; green dashed lines represent actor updates guided by critic gradients and soft updates of the target networks; black and purple solid lines denote the periodic aggregation and redistribution of actor and critic parameters via the aggregator. Each agent operates within its own local environment and contributes to a globally coordinated policy through federated learning (a hedged sketch of this aggregation step follows the figure list).
  • Figure 3: The left plot shows a radio map of a single 500 m$\times$500 m area used for centralized training and evaluation. The middle plot presents a single large-scale 1000 m$\times$1000 m environment used to evaluate model performance under centralized DRL training. The right plot depicts a 1000 m$\times$1000 m area composed of four such subregions, representing a composite environment for centralized training and inference. This comparison setting is used to evaluate the generalization capability of the global model trained via federated reinforcement learning against a centralized model trained on the combined area.
  • Figure 4: Illustrates the rewards of the TD3, DQNMA, and DQNSA models in a 500 m$\times$500 m area with 6 RUs and 20 UEs.
  • Figure 5: Illustrates the average energy consumption in a 500 m$\times$500 m area with 6 RUs and 10 to 40 UEs for the DQNMA, DQNSA, and TD3 models. The maximum theoretical energy consumption is 126\,W.
  • ...and 6 more figures
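To make the aggregation and redistribution step described in the Figure 2 caption concrete, here is a hedged PyTorch sketch of one federated round: the aggregator (the rApp in the abstract) averages the actor and critic parameters of the local xApp agents and pushes the result back to each of them. The uniform FedAvg-style weighting, the function names, and the toy linear networks are assumptions for illustration; the paper's actual aggregation rule may weight agents differently.

```python
import copy
import torch
import torch.nn as nn

# Hedged sketch of one Fed-TD3 aggregation round (cf. Figure 2):
# average the actor and critic parameters of all local agents,
# then redistribute the global parameters to every agent.
# Uniform weighting and all names here are illustrative assumptions.

def federated_average(models: list[nn.Module]) -> dict:
    """Return the element-wise mean of the models' state_dicts."""
    avg = copy.deepcopy(models[0].state_dict())
    for key in avg:
        stacked = torch.stack([m.state_dict()[key].float() for m in models])
        avg[key] = stacked.mean(dim=0)
    return avg

def aggregate_and_redistribute(actors, critics):
    """One round: average both networks, then push back to every agent."""
    global_actor = federated_average(actors)
    global_critic = federated_average(critics)
    for actor, critic in zip(actors, critics):
        actor.load_state_dict(global_actor)
        critic.load_state_dict(global_critic)

# Toy usage: four areas with identical tiny actor/critic networks.
actors = [nn.Linear(6, 6) for _ in range(4)]
critics = [nn.Linear(12, 1) for _ in range(4)]
aggregate_and_redistribute(actors, critics)
```

Averaging both actor and critic parameters mirrors the black and purple lines in Figure 2, which denote aggregation and redistribution of both networks rather than the actor alone.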