Intelligible Protocol Learning for Resource Allocation in 6G O-RAN Slicing
Farhad Rezazadeh, Hatim Chergui, Shuaib Siddiqui, Josep Mangues, Houbing Song, Walid Saad, Mehdi Bennis
TL;DR
This work tackles inter-slice contention in 6G O-RAN slicing by introducing STEP, a multi-agent DRL framework that couples deep Q-learning with an information bottleneck (IB) so that concise inter-agent protocols emerge during training. A stochastic bottleneck layer compresses state and message information into a latent representation; the amount of compression is set by a KL-divergence term weighted by a trade-off parameter $\beta$. Each agent takes a dual action: a resource allocation decision and a communication signal. In a three-slice conflict scenario, STEP reduces conflicts by up to $6.06\times$ versus predefined protocols and $3.4\times$ versus an MADRL baseline, lowers median latency from $4$ ms to $0$ ms, and improves CPU utilization by up to $1.4\times$, supporting both the performance gains and the interpretability of the learned protocol. The approach is designed to be O-RAN compliant, with agents deployable as rApps/xApps on the RAN intelligent controllers (xApps run in the Near-RT RIC and reach the RAN over E2 interfaces). The paper also outlines future directions in communication-space design, scalability through meta-learning, and cross-layer protocol integration for dynamic, interoperable 6G network slicing.
Abstract
An adaptive standardized protocol is essential for addressing inter-slice resource contention and conflict in network slicing. Traditional protocol standardization is a cumbersome task that yields hardcoded, predefined protocols, resulting in increased costs and delayed rollout. Going beyond these limitations, this paper proposes a novel multi-agent deep reinforcement learning (MADRL) communication framework called standalone explainable protocol (STEP) for future sixth-generation (6G) open radio access network (O-RAN) slicing. As new conditions arise and affect network operation, resource orchestration agents adapt their communication messages so that a protocol emerges on the fly, which mitigates conflict and resource contention between network slices. STEP combines information bottleneck (IB) theory with deep Q-network (DQN) learning. By incorporating a stochastic bottleneck layer, inspired by variational autoencoders (VAEs), STEP imposes an information-theoretic constraint on emergent inter-agent communication. This ensures that agents exchange concise and meaningful information, preventing resource waste and enhancing overall system performance. The learned protocols enhance interpretability, laying a robust foundation for standardizing next-generation 6G networks. A conflict resolution protocol is developed for an O-RAN-compliant network slicing resource allocation problem. In particular, the results demonstrate that, on average, STEP reduces inter-slice conflicts by up to 6.06x compared to a predefined protocol method. Furthermore, in comparison with an MADRL baseline, STEP achieves 1.4x and 3.5x lower resource underutilization and latency, respectively.
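The abstract does not spell out the loss or the network architecture, so the following is only a minimal sketch of how a VAE-style stochastic bottleneck could be combined with a DQN objective. It assumes a diagonal-Gaussian message encoder with a standard normal prior and a squared one-step TD error; the function names (`sample_message`, `kl_to_standard_normal`, `ib_dqn_loss`) and the default values of `gamma` and `beta` are hypothetical and are not taken from the paper.

```python
import numpy as np


def sample_message(mu, log_var, rng):
    """Draw a latent message z = mu + sigma * eps (reparameterisation trick).

    Assumption: the bottleneck is a diagonal Gaussian, as in a VAE.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def ib_dqn_loss(q_pred, reward, q_next_max, mu, log_var, gamma=0.99, beta=1e-3):
    """Squared TD error plus a beta-weighted KL penalty, averaged over a batch.

    Assumption: this additive form is illustrative, not the paper's exact loss.
    A larger beta pushes messages toward the prior, i.e. more compression.
    """
    td_target = reward + gamma * q_next_max
    td_loss = (td_target - q_pred) ** 2
    return float(np.mean(td_loss + beta * kl_to_standard_normal(mu, log_var)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.normal(size=(4, 8))        # batch of 4 agents' message means
    log_var = np.zeros((4, 8))          # unit variance for every latent dim
    z = sample_message(mu, log_var, rng)
    loss = ib_dqn_loss(
        q_pred=np.zeros(4), reward=np.ones(4), q_next_max=np.zeros(4),
        mu=mu, log_var=log_var,
    )
    print(z.shape, loss)
```

In this sketch, setting `beta` to zero recovers a plain TD loss with unconstrained messages, while increasing it trades task performance for shorter, more regular messages, which is the trade-off the $\beta$ parameter is described as controlling.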
