PandORA: Automated Design and Comprehensive Evaluation of Deep Reinforcement Learning Agents for Open RAN

Maria Tsampazi; Salvatore D'Oro; Michele Polese; Leonardo Bonati; Gwenael Poitau; Michael Healy; Mohammad Alavirad; Tommaso Melodia

TL;DR

PandORA is a framework that automatically designs and trains DRL agents for Open RAN applications, packages them as xApps, and evaluates them in the Colosseum wireless network emulator. The results show that suitably tuning the RAN control timers and properly selecting reward designs and DRL architectures can boost network performance, depending on network conditions and demand.

Abstract

The highly heterogeneous ecosystem of NextG wireless communication systems calls for novel networking paradigms where functionalities and operations can be dynamically and optimally reconfigured in real time to adapt to changing traffic conditions and satisfy stringent and diverse QoS demands. Open RAN technologies, and specifically those being standardized by the O-RAN Alliance, make it possible to integrate network intelligence into the once monolithic RAN via intelligent applications, namely xApps and rApps. These applications enable flexible control of network resources and functionalities, network management, and orchestration through data-driven intelligent control loops. Recent work has shown that DRL is effective in dynamically controlling O-RAN systems. However, how to design these solutions so that they manage heterogeneous optimization goals and prevent unfair resource allocation is still an open challenge, with the logic within DRL agents often treated as a black box. In this paper, we introduce PandORA, a framework to automatically design and train DRL agents for Open RAN applications, package them as xApps, and evaluate them in the Colosseum wireless network emulator. We benchmark 23 xApps that embed DRL agents trained using different architectures, reward designs, action spaces, and decision-making timescales, and with the ability to hierarchically control different network parameters. We test these agents on the Colosseum testbed under diverse traffic and channel conditions, in static and mobile setups. Our experimental results indicate that suitable fine-tuning of the RAN control timers, as well as proper selection of reward designs and DRL architectures, can boost network performance according to the network conditions and demand. Notably, finer decision-making granularities can improve mMTC performance by ~56% and increase eMBB throughput by ~99%.

Paper Structure (18 sections, 7 equations, 26 figures, 12 tables)


Introduction
Related Work
Contributions and Outline
The PandORA System Model and Evaluation Framework
System Model and Reference Use-Case Scenario
PandORA Overview and Procedures
DRL Agent Architectures tested in this work
DRL Optimization Strategies
Experimental Setup and DRL Training
In-Sample Experimental Evaluation
Impact of Discount Factor on the Action Space
Impact of Hierarchical Decision-Making
Impact of Per-Slice Scheduling Profile Selection
Impact of Weight Configuration
Impact of RAN Control Timers
...and 3 more sections

Figures (26)

Figure 1: PandORA framework for intent-driven DRL training, xApp on-boarding, and testing with Open RAN in Colosseum.
Figure 2: Reference O-RAN testing architecture with focus on the case of two xApps operating at different time scales, $T_{i}$, as described in Section IV-B.
Figure 3: Reference O-RAN testing architecture with focus on the case of four xApps operating at time scale $T$, as described in Section IV-C.
Figure 4: Performance evaluation under different action spaces and values of the $\gamma$ parameter with the PPO DRL Architecture.
Figure 5: Median values under different action spaces and values of $\gamma$ with the PPO DRL Architecture.
...and 21 more figures
