Large Language Model-enhanced Reinforcement Learning for Low-Altitude Economy Networking
Lingyi Cai, Ruichen Zhang, Changyuan Zhao, Yu Zhang, Jiawen Kang, Dusit Niyato, Tao Jiang, Xuemin Shen
TL;DR
This work addresses the challenge of enabling robust, energy-efficient low-altitude economy networking (LAENet), that is, aerial networking below 1,000 meters, by integrating large language models (LLMs) with reinforcement learning (RL). It presents a tutorial and an LLM-enhanced RL framework in which LLMs act as information processors, reward designers, decision-makers, and generators to mitigate RL's limitations in generalization, reward design, and stability. A case study in a UAV-assisted IoT scenario shows that LLM-designed rewards improve learning efficiency and energy performance, with up to 7.2% lower final energy consumption for TD3 and significant gains across packet sizes. The paper also outlines future directions: modular LLM-RL agents, memory-enabled planning, and multi-agent LLM collaboration for scalable, intelligent aerial networking systems.
Abstract
Low-Altitude Economy Networking (LAENet) aims to support diverse flying applications below 1,000 meters by deploying various aerial vehicles for flexible and cost-effective aerial networking. However, complex decision-making, resource constraints, and environmental uncertainty pose significant challenges to the development of the LAENet. Reinforcement learning (RL) offers a potential solution to these challenges but has limitations in generalization, reward design, and model stability. The emergence of large language models (LLMs) offers new opportunities for RL to mitigate these limitations. In this paper, we first present a tutorial on integrating LLMs into RL by exploiting the generation, contextual understanding, and structured reasoning capabilities of LLMs. We then propose an LLM-enhanced RL framework for the LAENet in which the LLM serves as information processor, reward designer, decision-maker, and generator. Moreover, we conduct a case study in which LLMs design a reward function to improve the learning performance of RL in the LAENet. Finally, we conclude the paper and discuss future work.
