Large Language Model (LLM)-enabled Reinforcement Learning for Wireless Network Optimization
Jie Zheng, Ruichen Zhang, Dusit Niyato, Haijun Zhang, Jiacheng Wang, Hongyang Du, Jiawen Kang, Zehui Xiong
TL;DR
This work tackles optimization of 6G wireless networks under diverse user demands by integrating large language models (LLMs) with reinforcement learning (RL). It proposes a framework that classifies LLM roles within the RL loop (feature extraction, reward design, policy interpretation, and decision-making) and applies them at every protocol layer, from physical to application. A novel LESR-enabled MARL framework for UAV-satellite service migration uses state representations and semantic reasoning to generate intrinsic rewards and guide decisions, achieving notable gains in simulation. The paper also discusses cross-layer challenges, lessons learned, and future directions for secure, low-overhead, end-to-end LLM-assisted RL in wireless networks.
Abstract
Enhancing future wireless networks is a significant challenge for networking systems, given diverse user demands and the emergence of 6G technology. While reinforcement learning (RL) is a powerful framework, it often struggles with high-dimensional state spaces and complex environments, which leads to substantial computational demands, difficulties with distributed intelligence, and potentially inconsistent outcomes. Large language models (LLMs), with their extensive pretrained knowledge and advanced reasoning capabilities, offer promising tools to enhance RL in optimizing 6G wireless networks. We explore RL models augmented by LLMs, emphasizing the roles LLMs play and the potential benefits of this synergy in wireless network optimization. We then examine LLM-enabled RL across the protocol layers: physical, data link, network, transport, and application. Additionally, we propose LLM-assisted state representation and semantic extraction to enhance a multi-agent reinforcement learning (MARL) framework. This approach is applied to service migration, request routing, and topology graph generation in unmanned aerial vehicle (UAV)-satellite networks. Through case studies, we demonstrate that our framework effectively optimizes wireless networks. Finally, we outline prospective research directions for LLM-enabled RL in wireless network optimization.
