DRL-based Latency-Aware Network Slicing in O-RAN with Time-Varying SLAs

Raoul Raftopoulos; Salvatore D'Oro; Tommaso Melodia; Giovanni Schembra

TL;DR

This work tackles latency-aware network slicing in O-RAN under time-varying SLAs by designing a PPO-based DRL xApp that dynamically allocates PRBs to slices. The agent observes a compact set of latency-correlated KPMs together with the SLA parameters, and it allocates PRBs so that SLA targets are met. A sigmoid-based reward balances SLA satisfaction against resource efficiency. The agent is trained and evaluated in OpenRAN Gym and Colosseum and compared against DQN and Q-Learning baselines. It achieves SLA violation rates $8.3\times$ and $14.4\times$ lower than DQN and Q-Learning, respectively, while consuming $0.3\times$ and $0.6\times$ fewer resources. The work also demonstrates practical O-RAN integration via the Near-RT RIC, NWDAF enrichment, and E2 interfaces, so that changing SLAs for latency-sensitive slices are handled without re-training.
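To make the idea of a sigmoid-based reward concrete, the following is a minimal sketch of one way such a reward could trade off SLA satisfaction against PRB usage. It is not the reward function from the paper: the function names `sigmoid` and `slice_reward`, the `steepness` and `efficiency_weight` parameters, and the linear PRB penalty are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def slice_reward(
    latency_ms: float,
    sla_ms: float,
    prbs_used: int,
    prbs_total: int,
    steepness: float = 0.1,          # hypothetical: how sharply reward drops near the SLA
    efficiency_weight: float = 0.2,  # hypothetical: weight of the PRB usage penalty
) -> float:
    """Illustrative reward for one slice (assumed form, not the paper's).

    The SLA term approaches 1 when latency is well below the SLA
    threshold and approaches 0 when latency is well above it. The
    penalty grows linearly with the fraction of PRBs assigned.
    """
    if prbs_total <= 0:
        raise ValueError("prbs_total must be positive")
    sla_term = sigmoid(steepness * (sla_ms - latency_ms))
    prb_penalty = efficiency_weight * (prbs_used / prbs_total)
    return sla_term - prb_penalty


if __name__ == "__main__":
    # Same PRB budget, latency below vs. above a 110 ms threshold.
    print(slice_reward(60.0, 110.0, prbs_used=10, prbs_total=50))
    print(slice_reward(160.0, 110.0, prbs_used=10, prbs_total=50))
```

With this shape, the agent gains little by pushing latency far below the threshold, so any extra PRBs spent there are penalized. That is one plausible way to encourage resource efficiency once the SLA is met.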

Abstract

The Open Radio Access Network (Open RAN) paradigm, and its reference architecture proposed by the O-RAN Alliance, is paving the way toward open, interoperable, observable and truly intelligent cellular networks. Crucial to this evolution is Machine Learning (ML), which will play a pivotal role by providing the necessary tools to realize the vision of self-organizing O-RAN systems. However, to be actionable, ML algorithms need to demonstrate high reliability, effectiveness in delivering high performance, and the ability to adapt to varying network conditions, traffic demands and performance requirements. To address these challenges, in this paper we propose a novel Deep Reinforcement Learning (DRL) agent design for O-RAN applications that can learn control policies under varying Service Level Agreements (SLAs) with heterogeneous minimum performance requirements. We focus on the case of RAN slicing and SLAs specifying maximum tolerable end-to-end latency levels. We use the OpenRAN Gym open-source environment to train a DRL agent that can adapt to varying SLAs and compare it against the state of the art. We show that our agent maintains a low SLA violation rate that is 8.3x and 14.4x lower than approaches based on Deep Q-Learning (DQN) and Q-Learning, respectively, while consuming 0.3x and 0.6x fewer resources, respectively, without the need for re-training.

Paper Structure (12 sections, 4 equations, 11 figures)

This paper contains 12 sections, 4 equations, 11 figures.

Introduction
System Model
DRL agent design for adaptive latency-aware slicing
Proximal Policy Optimization (PPO) architecture
O-RAN Integration and Inference
O-RAN architecture overview
O-RAN Integration
Experimental Setup and Data collection
Numerical Results
STAT Scenario: Performance evaluation and comparison
DYN Scenario: performance evaluation and comparison
Conclusions and Future Work

Figures (11)

Figure 1: O-RAN architecture with the integration of the proposed DRL agent.
Figure 2: O-RAN testbed setup and cellular scenario used for data collection and testing of our DRL agent.
Figure 3: Reward of the first network slice in the first scenario during training ($\Lambda_1 = 110$ ms, $\varphi_{1}^{(\mathrm{SLA})} = 0.99$).
Figure 4: Violation threshold rate of the first network slice in the first scenario during training ($\Lambda_1 = 110$ ms, $\varphi_{1}^{(\mathrm{SLA})} = 0.99$).
Figure 5: Reward of the second network slice in the first scenario during training ($\Lambda_2 = 50$ ms, $\varphi_{2}^{(\mathrm{SLA})} = 0.99$).
...and 6 more figures
