Interpretable Attention-Based Multi-Agent PPO for Latency Spike Resolution in 6G RAN Slicing
Kavan Fatehi, Mostafa Rahmani Ghourtani, Amir Sonee, Poonam Yadav, Alessandra M Russo, Hamed Ahmadi, Radu Calinescu
TL;DR
This work tackles sudden latency spikes in 6G RAN slicing under strict SLAs by introducing AE-MAPPO, an interpretable, attention-enhanced multi-agent PPO framework. It embeds six specialized attention heads into the policy to produce faithful explanations at inference time, and it operates across O-RAN timescales in three phases: predictive, reactive, and inter-slice optimization. The approach jointly optimizes performance and explainability through a novel utility that blends QoS satisfaction, efficiency, fairness, and an explainability term, so that decisions remain transparent. In a URLLC case study, AE-MAPPO resolves a latency spike within $18$ ms, achieves a URLLC latency of $0.98$ ms with $99.9999\%$ reliability, and reduces troubleshooting time by $93\%$, demonstrating practical, trustworthy automation for 6G RAN slicing.
Abstract
Sixth-generation (6G) radio access networks (RANs) must enforce strict service-level agreements (SLAs) for heterogeneous slices, yet sudden latency spikes remain difficult to diagnose and resolve with conventional deep reinforcement learning (DRL) or explainable RL (XRL). We propose \emph{Attention-Enhanced Multi-Agent Proximal Policy Optimization (AE-MAPPO)}, which integrates six specialized attention mechanisms into multi-agent slice control and surfaces them as zero-cost, faithful explanations. The framework operates across O-RAN timescales with a three-phase strategy: predictive, reactive, and inter-slice optimization. A URLLC case study shows that AE-MAPPO resolves a latency spike in $18$ ms, restores latency to $0.98$ ms with $99.9999\%$ reliability, and reduces troubleshooting time by $93\%$ while maintaining eMBB and mMTC continuity. These results confirm AE-MAPPO's ability to combine SLA compliance with inherent interpretability, enabling trustworthy, real-time automation for 6G RAN slicing.
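To illustrate how attention weights can double as explanations at no extra inference cost, the following is a minimal sketch and not the AE-MAPPO implementation. It assumes a single scaled dot-product attention head over a few named state features; the feature names, vectors, and the helper `attend_and_explain` are hypothetical and chosen only for illustration. The weights computed for the forward pass are simply ranked and returned alongside the context vector.

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over a list of keys."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_and_explain(query, keys, values, names, top_k=2):
    """Return the attended context and the top-k (feature, weight) pairs.

    The explanation reuses the weights already needed for the context
    vector, so no additional model pass is required.
    """
    weights = attention_weights(query, keys)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    ranked = sorted(zip(names, weights), key=lambda nw: nw[1], reverse=True)
    return context, ranked[:top_k]

# Hypothetical slice-state features (illustrative values, not from the paper).
names = ["queue_length", "prb_utilization", "channel_quality"]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = [[0.9], [0.2], [0.4]]
query = [2.0, 0.0]

context, explanation = attend_and_explain(query, keys, values, names)
for name, weight in explanation:
    print(f"{name}: {weight:.3f}")
```

In this toy setup the query aligns most strongly with the `queue_length` key, so that feature receives the largest weight and leads the explanation. Whether such weights are faithful in a trained policy depends on the architecture and training objective, which this sketch does not model.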
