Interpretable Attention-Based Multi-Agent PPO for Latency Spike Resolution in 6G RAN Slicing

Kavan Fatehi; Mostafa Rahmani Ghourtani; Amir Sonee; Poonam Yadav; Alessandra M Russo; Hamed Ahmadi; Radu Calinescu

TL;DR

This work tackles sudden latency spikes in 6G RAN slicing under strict SLAs by introducing AE-MAPPO, an interpretable, attention-enhanced multi-agent PPO framework. It embeds six specialized attention heads into the policy to produce faithful explanations at inference time, while operating across O-RAN timescales with predictive, reactive, and inter-slice phases. The approach jointly optimizes performance and explainability via a novel utility that blends QoS satisfaction, efficiency, fairness, and an explainability term, ensuring transparent decisions. In a URLLC case study, AE-MAPPO resolves spikes within $18$ ms, achieves URLLC latency $0.98$ ms with $99.9999\%$ reliability, and reduces troubleshooting time by $93\%$, demonstrating practical, trustworthy automation for 6G RAN slicing.
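As a rough illustration of how a utility might blend the four terms named above, the sketch below uses a plain weighted sum. The weights, the normalisation of each term to $[0, 1]$, the use of Jain's index for fairness, and all function and class names are assumptions made for illustration; the summary does not state the paper's actual formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtilityWeights:
    """Hypothetical blend weights; the values are illustrative, not from the paper."""
    qos: float = 0.4
    efficiency: float = 0.2
    fairness: float = 0.2
    explainability: float = 0.2


def jain_index(allocations):
    """Jain's fairness index: 1.0 for equal shares, 1/n when one slice gets everything.

    Assumed here as the fairness measure; the paper may define fairness differently.
    """
    total = sum(allocations)
    squares = sum(a * a for a in allocations)
    if squares == 0:
        return 0.0  # no resources allocated at all
    return (total * total) / (len(allocations) * squares)


def blended_utility(qos, efficiency, fairness, explainability,
                    weights=UtilityWeights()):
    """Weighted sum of four terms, each assumed to be normalised to [0, 1]."""
    terms = {
        "qos": qos,
        "efficiency": efficiency,
        "fairness": fairness,
        "explainability": explainability,
    }
    for name, value in terms.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return (weights.qos * qos
            + weights.efficiency * efficiency
            + weights.fairness * fairness
            + weights.explainability * explainability)


if __name__ == "__main__":
    # Hypothetical per-slice resource shares for URLLC, eMBB, and mMTC.
    shares = [0.5, 0.3, 0.2]
    print(blended_utility(qos=0.95, efficiency=0.8,
                          fairness=jain_index(shares), explainability=0.9))
```

A weighted sum is the simplest way to trade the terms off against each other; a formulation that treats SLA compliance as a hard constraint would need a different structure.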

Abstract

Sixth-generation (6G) radio access networks (RANs) must enforce strict service-level agreements (SLAs) for heterogeneous slices, yet sudden latency spikes remain difficult to diagnose and resolve with conventional deep reinforcement learning (DRL) or explainable RL (XRL). We propose \emph{Attention-Enhanced Multi-Agent Proximal Policy Optimization (AE-MAPPO)}, which integrates six specialized attention mechanisms into multi-agent slice control and surfaces them as zero-cost, faithful explanations. The framework operates across O-RAN timescales with a three-phase strategy: predictive, reactive, and inter-slice optimization. A URLLC case study shows AE-MAPPO resolves a latency spike in $18$ ms, restores latency to $0.98$ ms with $99.9999\%$ reliability, and reduces troubleshooting time by $93\%$ while maintaining eMBB and mMTC continuity. These results confirm AE-MAPPO's ability to combine SLA compliance with inherent interpretability, enabling trustworthy and real-time automation for 6G RAN slicing.

Paper Structure (15 sections, 11 equations, 4 figures, 2 tables)

This paper contains 15 sections, 11 equations, 4 figures, 2 tables.

Introduction
Problem Formulation and System Model
System and QoS Targets
Joint Resource Allocation
QoS satisfaction
Efficiency
Fairness
Explainability Utility
Proposed Framework: AE-MAPPO
Interpretable Policy with Six Attentions
Three-Phase Allocation Across O-RAN Loops
Learning Objective
Case Study: Latency Spike Resolution
Performance Evaluation
Conclusion

Figures (4)

Figure 1: AE-MAPPO architecture integrating six attention mechanisms with three-phase allocation strategy for explainable network slicing.
Figure 2: Multi-dimensional performance comparison across six critical metrics. AE-MAPPO (green) achieves the largest coverage area (86%), demonstrating consistent superiority across all dimensions while maintaining explainability.
Figure 3: Performance-explainability Pareto frontier. AE-MAPPO uniquely achieves the ideal region (green area), demonstrating that the traditional trade-off assumption (gray dashed line) results from architectural choices rather than fundamental constraints.
Figure 4: AE-MAPPO failure recovery timeline demonstrating response to critical URLLC QoS violation. The system leverages sequential attention mechanism activation for detection, diagnosis, and recovery, completing the entire process within 80 ms while providing interpretable explanations at each stage.
