Extended Reality (XR) Codec Adaptation in 5G using Multi-Agent Reinforcement Learning with Attention Action Selection

Pedro Enrique Iturria-Rivera, Raimundas Gaigalas, Medhat Elsayed, Majid Bavand, Yigit Ozcan, Melike Erol-Kantarci

TL;DR

This work tackles cross-layer optimization of XR codec adaptation over 5G/6G networks with a cooperative multi-agent reinforcement learning (MARL) approach. It introduces Optimistic QMIX (oQMIX) with an attention-based slate action mechanism within a Slate-Dec-POMDP to coordinate Augmented Reality (AR), Virtual Reality (VR), and Cloud Gaming (CG) traffic for improved XR Quality of Experience (QoE). A tailored XR QoE framework (XQI) and a reward structure align learning with practical KPIs. In simulations, oQMIX outperforms the Adjust Packet Size (APS) baseline with average gains of 30.1%, 15.6%, 16.5%, and 50.3% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively, while keeping goodput stable. The findings suggest that cross-layer MARL with attention can adapt to channel conditions and traffic mix, offering a practical route to enhanced XR experiences in future wireless networks.

Abstract

Extended Reality (XR) services will revolutionize applications over 5th and 6th generation wireless networks by providing seamless virtual and augmented reality experiences. These applications impose significant challenges on network infrastructure, which can be addressed by machine learning algorithms due to their adaptability. This paper presents a Multi-Agent Reinforcement Learning (MARL) solution for optimizing codec parameters of XR traffic, comparing it to the Adjust Packet Size (APS) algorithm. Our cooperative multi-agent system uses an Optimistic Mixture of Q-Values (oQMIX) approach for handling Cloud Gaming (CG), Augmented Reality (AR), and Virtual Reality (VR) traffic. Enhancements include an attention mechanism and slate-Markov Decision Process (MDP) for improved action selection. Simulations show our solution outperforms APS with average gains of 30.1%, 15.6%, 16.5%, and 50.3% in XR index, jitter, delay, and Packet Loss Ratio (PLR), respectively. APS tends to increase throughput but also packet losses, whereas oQMIX reduces PLR, delay, and jitter while maintaining goodput.

Paper Structure (17 sections, 7 equations, 5 figures, 3 tables, 2 algorithms)

Figures (5)

  • Figure 1: An AI-powered application server exchanges aggregated KPIs from the BS and UEs to decide on the proper XR and CG codec parameters to satisfy XR/CG QoE requirements.
  • Figure 2: (a) Illustration of user locations in three coverage regions. When users are located in the outer rings, the problem becomes harder because fewer actions satisfy the QoE requirements. (b) Overview of the oQMIX algorithm with action prohibition.
  • Figure 3: Convergence performance for QMIX at (a) 200 m, (b) 300 m, and (c) 400 m, and for oQMIX at (d) 200 m, (e) 300 m, and (f) 400 m. The left and right y-axes in each subfigure indicate the team reward and the success percentage of each algorithm, respectively.
  • Figure 4: Key Performance Indicators of interest vs. distance: (a) XR index, (b) jitter, (c) delay, and (d) Packet Loss Ratio.
  • Figure 5: Flow-based performance of APS, QMIX, and oQMIX vs. distance: (a) throughput and (b) goodput.