Mobile Cell-Free Massive MIMO with Multi-Agent Reinforcement Learning: A Scalable Framework
Ziheng Liu, Jiayi Zhang, Yiyang Zhu, Enyu Shi, Bo Ai
TL;DR
This work addresses mobility-enabled cell-free mMIMO by introducing UAV-like mobile access points (mobile-APs) and jointly optimizing their movement and downlink power. It develops a scalable MARL framework that combines GNN-aided communication among agents, a permutation-based architecture that reduces the observation dimension, and a directional decoupling strategy that allocates rewards according to each agent's contribution. The proposed SF-MADDPG framework, which includes dynamic and hyper permutation networks plus credit assignment via ARN and HRN, achieves substantial gains in sum spectral efficiency (SE) and faster convergence than the baselines while reducing observation-space complexity. The results indicate that SE grows linearly with system size and that online learning keeps the scheme robust to evolving channels. These findings point to a practical, scalable route to uniform, high-quality service in dense, mobile cell-free networks, with potential relevance to 6G-era deployments.
Abstract
Cell-free massive multiple-input multiple-output (mMIMO) offers significant advantages in mobility scenarios, mainly because it eliminates cell boundaries and provides strong macro diversity. In this paper, we examine the downlink performance of cell-free mMIMO systems equipped with mobile access points (mobile-APs) that adopt the concept of unmanned aerial vehicles, where mobility and power control are jointly considered to enhance coverage and suppress interference. However, conventional optimization schemes suffer from high computational complexity, poor collaboration, limited scalability, and uneven reward distribution, which lead to serious performance degradation and instability. These factors make it difficult to provide consistent, high-quality service to all user equipments in downlink cell-free mMIMO systems. We therefore propose a novel scalable framework based on multi-agent reinforcement learning (MARL) to tackle these challenges. The framework incorporates a graph neural network (GNN)-aided communication mechanism to facilitate collaboration among agents, a permutation architecture to improve scalability, and a directional decoupling architecture to distinguish the contribution of each agent accurately. The numerical results compare different optimization schemes and network architectures and show that the proposed scheme improves system performance over conventional schemes by virtue of the techniques it adopts. In particular, appropriately compressing the observation space of the agents helps achieve a better balance between performance and convergence.
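The idea of compressing an agent's observation space with a permutation architecture can be illustrated with a generic permutation-invariant encoder. The sketch below is an assumption for illustration only and is not the paper's dynamic or hyper permutation network: it applies one shared embedding to every observed entity and mean-pools the results (a Deep Sets-style construction). The names `embed` and `encode_observation`, the feature layout, and the sizes are hypothetical.

```python
import math
import random


def embed(entity, weights):
    """Shared per-entity embedding tanh(W x), applied identically to every entity."""
    return [math.tanh(sum(w * x for w, x in zip(row, entity))) for row in weights]


def encode_observation(entities, weights):
    """Mean-pool the shared embeddings so the result ignores entity order and count."""
    embedded = [embed(e, weights) for e in entities]
    n = len(embedded)
    return [sum(column) / n for column in zip(*embedded)]


random.seed(0)
FEATURES, EMBED_DIM = 3, 4  # hypothetical sizes, e.g. (x, y, channel gain) per entity
weights = [[random.uniform(-1, 1) for _ in range(FEATURES)] for _ in range(EMBED_DIM)]

# Hypothetical local observation: one feature vector per nearby user equipment.
entities = [[random.uniform(-1, 1) for _ in range(FEATURES)] for _ in range(6)]
shuffled = entities[:]
random.shuffle(shuffled)

code = encode_observation(entities, weights)
code_shuffled = encode_observation(shuffled, weights)

# A larger observation set still maps to a vector of length EMBED_DIM.
more_entities = entities + [[random.uniform(-1, 1) for _ in range(FEATURES)] for _ in range(4)]
code_more = encode_observation(more_entities, weights)
```

Because the pooled vector has a fixed length and does not depend on the ordering of the entities, the policy input in this sketch stays the same size as the number of observed entities grows, which is the property that makes such encoders attractive for scalability.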
