Cascade Reinforcement Learning with State Space Factorization for O-RAN-based Traffic Steering

Chuanneng Sun; Gueyoung Jung; Tuyen Xuan Tran; Dario Pompili

TL;DR

This work builds a data-driven, scalable RIC digital twin modeled on real-world data from a tier-1 RAN operator, including network setup, user geo-distribution, and traffic demand, and uses it to evaluate the performance of CaRL in two areas of the Northeastern US.

Abstract

The Open Radio Access Network (O-RAN) architecture empowers intelligent and automated optimization of the RAN through applications deployed on the RAN Intelligent Controller (RIC) platform, enabling capabilities beyond what is achievable with traditional RAN solutions. Within this paradigm, Traffic Steering (TS) emerges as a pivotal RIC application that optimizes cell-level mobility settings in near-real-time, aiming to significantly improve network spectral efficiency. In this paper, we design a novel TS algorithm based on a Cascade Reinforcement Learning (CaRL) framework. We propose state space factorization and policy decomposition to reduce the need for large models and well-labeled datasets. For each sub-state space, an RL sub-policy is trained to learn an optimized mapping onto the action space. To apply CaRL to new network regions, we propose a knowledge transfer approach that initializes a new sub-policy from the knowledge learned by the trained policies. To evaluate CaRL, we build a data-driven and scalable RIC digital twin (DT) modeled on real-world data, including network configuration, user geo-distribution, and traffic demand, among others, from a tier-1 mobile operator in the US. We evaluate CaRL on two DT scenarios representing two network clusters in two different cities and compare its performance with the business-as-usual (BAU) policy and with competing optimization approaches based on heuristic and Q-table algorithms. Benchmarking results show that CaRL performs best, improving the average cluster-aggregated downlink throughput over the BAU policy by 24% and 18% in the two scenarios, respectively.
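The idea of state space factorization with one sub-policy per sub-space can be illustrated with a minimal Python sketch. It is not the paper's implementation: the partition rule (bucketing the first state feature by thresholds), the linear scorer standing in for a sub-policy network, and the transfer rule (averaging trained weights) are all assumptions made for illustration, as are the names `LinearSubPolicy`, `CascadePolicy`, and `init_from_trained`.

```python
import bisect
import random


class LinearSubPolicy:
    """Toy stand-in for one sub-policy network: a linear scorer over actions."""

    def __init__(self, weights):
        # weights[i][a] is the contribution of state feature i to action a.
        self.weights = weights

    def act(self, state):
        n_actions = len(self.weights[0])
        scores = [
            sum(s * row[a] for s, row in zip(state, self.weights))
            for a in range(n_actions)
        ]
        # Greedy choice: index of the highest-scoring action.
        return max(range(n_actions), key=scores.__getitem__)


def random_sub_policy(state_dim, n_actions, rng):
    """Create a sub-policy with small random weights (hypothetical helper)."""
    return LinearSubPolicy(
        [[rng.gauss(0.0, 0.1) for _ in range(n_actions)] for _ in range(state_dim)]
    )


class CascadePolicy:
    """Routes each state to the sub-policy that owns its sub-space."""

    def __init__(self, sub_policies, boundaries):
        # M sub-policies need M - 1 ascending thresholds.
        assert len(sub_policies) == len(boundaries) + 1
        self.sub_policies = sub_policies
        self.boundaries = boundaries

    def sub_space_index(self, state):
        # Assumed factorization rule: bucket the first state feature.
        return bisect.bisect_right(self.boundaries, state[0])

    def act(self, state):
        return self.sub_policies[self.sub_space_index(state)].act(state)


def init_from_trained(trained):
    """Assumed transfer rule: start a new sub-policy at the mean of trained weights."""
    rows, cols = len(trained[0].weights), len(trained[0].weights[0])
    mean = [
        [sum(p.weights[i][a] for p in trained) / len(trained) for a in range(cols)]
        for i in range(rows)
    ]
    return LinearSubPolicy(mean)


rng = random.Random(0)
subs = [random_sub_policy(state_dim=2, n_actions=3, rng=rng) for _ in range(3)]
policy = CascadePolicy(subs, boundaries=[0.3, 0.7])
action = policy.act([0.5, 1.0])  # handled by the middle sub-policy
new_sub_policy = init_from_trained(subs[:2])
```

Each sub-policy only ever sees states from its own sub-space, which is why it can stay small. A policy for a new region starts from existing knowledge rather than from scratch.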

Paper Structure (11 sections, 6 equations, 8 figures)

Introduction
Related Work
O-RAN Traffic Steering
Proposed Cascade RL Framework
Background and Notations
Cascade Reinforcement Learning
MDP Formulation
Performance Evaluation
Digital Twin Evaluation
Field Trial Evaluation
Conclusion

Figures (8)

Figure 1: (a) A visualization of the O-RAN facilitated traffic steering (TS) application. (b) Structure for the cascade policy. There are $M$ sub-policy neural networks corresponding to $M$ sub-spaces. (c) Digital twin architecture for evaluation.
Figure 2: Comparison between the real-world network trace data and the digital twin data for a specific cell in cluster 2 over a one-hour period for (a) sum downlink volume and (b) average number of RRC connected UEs. Due to proprietary restrictions, we cannot disclose the actual numbers of the volume, throughput, and the number of handovers. Instead, we normalized the values in this figure and provided the percentage.
Figure 3: Simulation results generated using RAN configuration data and traffic data from cluster 1. Due to proprietary restrictions, we cannot disclose the actual numbers of the volume, throughput, and the number of handovers. Instead, we normalized the values in this figure and provided the percentage.
Figure 4: Simulation results generated using RAN configuration data and traffic data from cluster 2. Due to proprietary restrictions, we cannot disclose the actual numbers of the volume, throughput, and the number of handovers. Instead, we normalized the values in this figure and provided the percentage.
Figure 5: Average connected UEs for three co-located cells in the same sector. Due to space limitations, we only present the results for one sector in each cluster. Due to proprietary restrictions, we cannot disclose the actual numbers of the volume, throughput, and the number of handovers. Instead, we normalized the values in this figure and provided the percentage.
...and 3 more figures
