Data-driven Bandwidth Adaptation for Radio Access Network Slices
Panagiotis Nikolaidis, Asim Zoulkarni, John Baras
TL;DR
The paper addresses SLA-driven QoS for multiple RAN slices that share a base station, focusing on packet-delay guarantees, through a data-driven Bandwidth Demand Estimator (BDE). It introduces a base station architecture with two functions, the BDE and a Network Slice Multiplexer, and casts the BDE's bandwidth decision as an infinite-horizon dynamic program. The program is solved with model-based reinforcement learning that periodically re-estimates the transition probabilities and runs value iteration, yielding an $\epsilon$-soft policy. Key contributions are per-slice learning and a practical evaluation on a 3GPP LTE Amarisoft testbed, which shows substantial bandwidth savings while meeting per-slice delay targets $Q_i(t)\le Q_c$ and SLA probabilities $P_i$. The results demonstrate scalability and real-world viability for dynamic physical resource block (PRB) allocation, with domain insights such as cost monotonicity used to accelerate learning. Overall, the work provides a scalable, data-driven approach to satisfying SLAs in multi-slice RANs with non-trivial delay QoS requirements.
Abstract
The need to satisfy the QoS requirements of multiple network slices deployed at the same base station poses a major challenge to network operators. The problem becomes even harder when the desired QoS involves packet delays. In that case, network utility maximization is not directly applicable, since the utilities of the slices are unknown. As a result, most related works learn online both the utilities of all slices and how to split the resources among them. Unfortunately, this approach does not scale well to many slices. Instead, learning must be performed separately for each slice. To this end, we develop a bandwidth demand estimator: a network function that periodically receives the traffic of the slice as input and outputs the amount of bandwidth that its MAC scheduler needs to deliver the desired QoS. We develop the bandwidth demand estimator for QoS involving packet delay metrics, based on a model-based reinforcement learning algorithm. We implement the algorithm on a cellular testbed and conduct experiments with time-varying traffic loads. Results show that the algorithm delivers the desired QoS with significantly less bandwidth than non-adaptive approaches and other baseline online learning algorithms.
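The model-based reinforcement learning loop mentioned above (estimate transitions from data, run value iteration, act with an $\epsilon$-soft policy) can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the two-state, two-action toy problem, the cost values, the discount factor, the uniform fallback for unvisited state-action pairs, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def estimate_transitions(counts):
    """Convert transition counts[s, a, s'] into probabilities P[s, a, s'].

    Assumption: state-action pairs with no observations fall back to a
    uniform distribution over next states.
    """
    totals = counts.sum(axis=2, keepdims=True)
    n_states = counts.shape[2]
    uniform = np.full(counts.shape, 1.0 / n_states)
    return np.where(totals > 0, counts / np.maximum(totals, 1), uniform)

def value_iteration(P, cost, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Minimize expected discounted cost; returns (V, greedy_policy)."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iter):
        Q = cost + gamma * (P @ V)   # Q[s, a], since (S, A, S) @ (S,) -> (S, A)
        V_new = Q.min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = cost + gamma * (P @ V)
    return V, Q.argmin(axis=1)

def epsilon_soft_action(greedy_action, n_actions, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(greedy_action)

# Hypothetical toy problem: state 0 = delay target met, state 1 = violated;
# action 0 = low bandwidth, action 1 = high bandwidth.
counts = np.array([
    [[1, 9], [10, 0]],   # from state 0: low bandwidth, high bandwidth
    [[0, 10], [9, 1]],   # from state 1: low bandwidth, high bandwidth
])
bandwidth_cost = np.array([1.0, 2.0])   # assumed per-action bandwidth cost
violation_penalty = 10.0                # assumed penalty in the violated state
cost = bandwidth_cost[None, :] + violation_penalty * np.array([[0.0], [1.0]])

P = estimate_transitions(counts)
V, policy = value_iteration(P, cost)

rng = np.random.default_rng(0)
action = epsilon_soft_action(policy[0], n_actions=2, epsilon=0.1, rng=rng)
print("greedy policy:", policy.tolist(), "sampled action in state 0:", action)
```

In a periodic scheme of the kind the summary describes, the counts would be updated from observed transitions and the two estimation steps rerun at each period; the $\epsilon$-soft sampling keeps every bandwidth level explored so that the transition estimates continue to improve.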
