PPO-EPO: Energy and Performance Optimization for O-RAN Using Reinforcement Learning

Rawlings Ntassah, Gian Michele Dell'Aera, Fabrizio Granelli

TL;DR

A PPO-based reinforcement learning framework addresses Open RAN energy efficiency through traffic steering and selective cell shutdown, subject to throughput, interference, and PRB constraints. The objective combines throughput and energy gains via $\max_{x_k} \sum_{k=1}^{K} x_k \left(\omega_{\text{perf}} G_{\text{perf},k} + \omega_{\text{power}} P_{\text{gain},k}\right)$, with total energy efficiency $EE_{\text{total}} = \frac{R_{\text{avrg}}}{P_{\text{avrg}}}$; the constraints enforce post-shutdown performance and resource limits. The model is trained on a Turin-inspired CU–DU–RU topology with 12 RUs for 5 million PPO episodes and then validated on real data from the Viavi RIC Tester. In these experiments PPO outperforms the SARSA and Random baselines, delivering higher downlink throughput and lower energy consumption while honoring interference thresholds. The work underscores the practicality of RL for energy-aware O-RAN management and points to future avenues such as Hybrid and Hierarchical RL for more robust, predictive traffic control.
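The weighted objective and the energy-efficiency ratio in the summary above can be illustrated with a small Python sketch. It is only a toy under stated assumptions: $x_k$ is treated as a binary per-cell selection variable, the weights default to 0.5 each, the throughput, interference, and PRB constraints are left out entirely, and the function names are hypothetical rather than taken from the paper.

```python
from itertools import product


def objective(x, g_perf, p_gain, w_perf=0.5, w_power=0.5):
    """Weighted sum over cells: sum_k x_k * (w_perf * G_perf,k + w_power * P_gain,k).

    The default weights are illustrative assumptions, not values from the paper.
    """
    return sum(
        xk * (w_perf * g + w_power * p)
        for xk, g, p in zip(x, g_perf, p_gain)
    )


def energy_efficiency(rates, powers):
    """EE_total = R_avrg / P_avrg, using plain arithmetic means."""
    r_avg = sum(rates) / len(rates)
    p_avg = sum(powers) / len(powers)
    return r_avg / p_avg


def best_selection(g_perf, p_gain, w_perf=0.5, w_power=0.5):
    """Brute-force search over binary vectors x (toy sizes only).

    Assumption: x_k is binary. The paper's constraints are NOT modelled here,
    so this simply picks every cell whose weighted gain is positive.
    """
    k = len(g_perf)
    return max(
        product((0, 1), repeat=k),
        key=lambda x: objective(x, g_perf, p_gain, w_perf, w_power),
    )


if __name__ == "__main__":
    g_perf = [0.2, -0.5, 0.1]   # made-up per-cell performance gains
    p_gain = [0.3, 0.1, -0.4]   # made-up per-cell power gains
    x_star = best_selection(g_perf, p_gain)
    print(x_star, objective(x_star, g_perf, p_gain))
    print(energy_efficiency([100.0, 200.0], [50.0, 100.0]))
```

Exhaustive search is exponential in the number of cells and is used here only to make the objective concrete; the paper instead learns the decisions with a PPO policy.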

Abstract

Energy consumption in mobile communication networks has become a significant challenge due to its direct impact on Capital Expenditure (CAPEX) and Operational Expenditure (OPEX). The introduction of Open RAN (O-RAN) enables telecommunication providers to leverage network intelligence to optimize energy efficiency while maintaining Quality of Service (QoS). One promising approach involves traffic-aware cell shutdown strategies, where underutilized cells are selectively deactivated without compromising overall network performance. However, achieving this balance requires precise traffic steering mechanisms that account for throughput performance, power efficiency, and network interference constraints. This work proposes a reinforcement learning (RL) model based on the Proximal Policy Optimization (PPO) algorithm to optimize traffic steering and energy efficiency. The objective is to maximize energy efficiency and performance gains while strategically shutting down underutilized cells. The proposed RL model learns adaptive policies to make optimal shutdown decisions by considering throughput degradation constraints, interference thresholds, and PRB utilization balance. Experimental validation using TeraVM Viavi RIC tester data demonstrates that our method significantly improves the network's energy efficiency and downlink throughput.
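For readers unfamiliar with PPO, the sketch below computes the standard clipped surrogate objective that the algorithm maximizes. It follows the general PPO formulation, not the authors' implementation: the clipping parameter `eps=0.2` is a common default assumed here, and the probability ratios and advantages are made-up inputs rather than values from the paper.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over samples.

    ratios:     new-policy / old-policy action probabilities, one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping range (0.2 is an assumed default, not from the paper)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)  # clip the ratio
        terms.append(min(r * a, clipped_r * a))        # pessimistic bound
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # A large ratio with positive advantage is capped at 1 + eps;
    # a small ratio with negative advantage is penalized at 1 - eps.
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0]))
```

Taking the minimum of the clipped and unclipped terms removes the incentive to move the policy far from the one that collected the data, which is what keeps PPO updates stable.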

Paper Structure

This paper contains 8 sections, 18 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Network system design.
  • Figure 2: PPO training rewards over 5 million episodes.
  • Figure 3: Heatmap of the distribution of power and throughput gains.
  • Figure 4: Energy gain (ratio of rate to power) for the three methods.
  • Figure 5: CDF of network throughput for the different methods.
  • ...and 1 more figure