ScalO-RAN: Energy-aware Network Intelligence Scaling in Open RAN

Stefano Maxenti; Salvatore D'Oro; Leonardo Bonati; Michele Polese; Antonio Capone; Tommaso Melodia

TL;DR

ScalO-RAN addresses latency-guaranteed, energy-efficient scaling of AI-powered O-RAN applications in a multi-tenant Open RAN. It combines a data-driven, two-segment piecewise linear latency model with a mixed-integer optimization problem (ISP) that jointly decides which servers to activate and where to place xApps, rApps, and dApps, so that per-application latency and demand requirements are met while profit is maximized and energy consumption is minimized. The approach is validated with an OpenShift prototype and numerical experiments, which show that latency constraints drive scaling decisions and that ScalO-RAN achieves lower energy consumption and better latency compliance than traditional load-balancing approaches. The work advances practical, energy-conscious orchestration for cloudified RAN control planes, with implications for scalable 5G+/6G deployments in multi-tenant, multi-vendor environments.
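The two-segment piecewise linear latency model mentioned above can be illustrated with a minimal sketch. The break point, intercept, and slopes below are hypothetical placeholders, not values fitted in the paper; the function names are also illustrative only.

```python
def inference_time(n_apps, breakpoint=8, base=0.2, slope_low=0.02, slope_high=0.15):
    """Modeled inference time (seconds) for n_apps co-located applications.

    Two linear segments joined at `breakpoint`. All parameter values are
    hypothetical and chosen only for illustration.
    """
    if n_apps <= breakpoint:
        return base + slope_low * n_apps
    # Value at the break point, so the two segments meet continuously.
    at_break = base + slope_low * breakpoint
    return at_break + slope_high * (n_apps - breakpoint)


def max_apps_within(deadline, limit=64, **params):
    """Largest app count (up to `limit`) whose modeled latency meets the deadline."""
    best = 0
    for n in range(1, limit + 1):
        if inference_time(n, **params) <= deadline:
            best = n
    return best
```

Under these assumed parameters, latency grows slowly up to the break point and much faster beyond it, which is why a per-server cap on co-located applications follows from the latency deadline rather than from raw resource usage.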

Abstract

Network virtualization, software-defined infrastructure, and orchestration are pivotal elements in contemporary networks, yielding new vectors for optimization and novel capabilities. In line with these principles, O-RAN presents an avenue to bypass vendor lock-in, circumvent vertical configurations, enable network programmability, and facilitate integrated Artificial Intelligence (AI) support. Moreover, modern container orchestration frameworks (e.g., Kubernetes, Red Hat OpenShift) simplify the way cellular base stations, as well as the newly introduced RAN Intelligent Controllers (RICs), are deployed, managed, and orchestrated. While this enables cost reduction via infrastructure sharing, it also makes it more challenging to meet O-RAN control latency requirements, especially during peak resource utilization. To address this problem, we propose ScalO-RAN, a control framework rooted in optimization and designed as an O-RAN rApp that allocates and scales AI-based O-RAN applications (xApps, rApps, dApps) to: (i) abide by application-specific latency requirements, and (ii) monetize the shared infrastructure while reducing energy consumption. We prototype ScalO-RAN on an OpenShift cluster with base stations, RIC, and a set of AI-based xApps deployed as micro-services. We evaluate ScalO-RAN both numerically and experimentally. Our results show that ScalO-RAN can optimally allocate and distribute O-RAN applications within available computing nodes to accommodate even stringent latency requirements. More importantly, we show that scaling O-RAN applications is primarily a time-constrained problem rather than a resource-constrained one, where scaling policies must account for stringent inference time of AI applications, and not only how many resources they consume.
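The claim that scaling is time-constrained rather than resource-constrained can be seen in a toy placement sketch. This is an exhaustive search over a handful of applications, not the authors' optimization engine; it minimizes only the number of active servers as a stand-in for energy and ignores profit. The latency model, CPU demands, and deadlines are all hypothetical.

```python
from itertools import product


def latency(n_apps):
    """Hypothetical two-segment latency model (seconds) with a break point at 4 apps."""
    if n_apps <= 4:
        return 0.2 + 0.02 * n_apps
    return 0.28 + 0.3 * (n_apps - 4)


def place(apps, n_servers, cpu_capacity):
    """Exhaustively search placements that minimize the number of active servers.

    apps: list of (cpu_demand, latency_deadline) tuples (illustrative units).
    Returns (active_servers, assignment), or None if no placement is feasible.
    """
    best = None
    for assignment in product(range(n_servers), repeat=len(apps)):
        feasible = True
        for server in set(assignment):
            hosted = [apps[i] for i, s in enumerate(assignment) if s == server]
            # Resource constraint: total CPU demand must fit on the server.
            if sum(cpu for cpu, _ in hosted) > cpu_capacity:
                feasible = False
                break
            # Time constraint: modeled latency must meet every hosted deadline.
            if any(latency(len(hosted)) > deadline for _, deadline in hosted):
                feasible = False
                break
        if feasible:
            active = len(set(assignment))
            if best is None or active < best[0]:
                best = (active, assignment)
    return best


# Six apps with CPU demand 1 each fit on one 16-CPU server,
# yet a 0.5 s deadline forces a second server to be activated.
strict = [(1, 0.5)] * 6
relaxed = [(1, 1.0)] * 6
```

Calling `place(strict, 3, 16)` and `place(relaxed, 3, 16)` contrasts the two cases: CPU capacity is identical, and only the deadline changes how many servers must be switched on. The search space grows as `n_servers ** len(apps)`, so this sketch is usable only for tiny instances.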

Paper Structure (15 sections, 1 theorem, 9 equations, 13 figures, 1 table)

Introduction
Related Work
ScalO-RAN Architecture and Prototype
System Model
Notation and variables
Inference time of O-RAN Applications
Profiling inference time
Deriving a latency model
ScalO-RAN Optimization Engine
Formulating the problem
Objective function design
Computational Complexity
Performance Evaluation
Experimental Results
Conclusions

Key Result

Theorem 1

The ScalO-RAN optimization problem (labeled eq:problem in the paper) is NP-hard.

Figures (13)

Figure 1: Left: execution vs. queuing times for different numbers of xApps. The number of xApps is indicated by the color map. Right: inference time vs. number of xApps. Shaded areas represent the $1$ s threshold.
Figure 2: ScalO-RAN within the O-RAN architecture.
Figure 3: ScalO-RAN OpenShift-based prototype.
Figure 4: Inference time vs. CPU and RAM usage for different numbers of xApps. The color map indicates the number of xApps.
Figure 5: Left: execution time vs. queuing time for different numbers of xApps. The color map indicates the number of xApps. Right: inference time vs. number of xApps. $\otimes$ represents the break point of the piecewise linearization functions.
...and 8 more figures
