Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial
Yousef Emami, Hao Zhou, Radha Reddy, Atefeh Hajijamali Arani, Biliang Wang, Kai Li, Luis Almeida, Zhu Han
TL;DR
This survey investigates how Large Language Models (LLMs) and Multimodal LLMs (MLLMs) can augment UAV operations and communications, proposing a unified framework that links perception, planning, and control through cross-modal reasoning. It categorizes LLM adaptation techniques (pretraining, fine-tuning, RAG, and prompting) and highlights practical deployment across edge, on-device, and hybrid architectures, including techniques to extend context and ground outputs in domain knowledge. The paper then details LLM-enabled UAV navigation, swarm coordination, safety, and network optimization, as well as the emergence of MLLMs for vision–language guidance, supported by case studies, benchmarks (e.g., UAVBench, UAVThreatBench, AirCopBench), and a discussion of ethical, safety, and environmental considerations. Finally, it outlines near-, mid-, and long-term trends toward domain-specific UAV LLMs, edge-enabled inference, self-improving systems, and sustainable AI infrastructure, underscoring the practical and societal impact of intelligent, responsible aerial systems.
Abstract
Uncrewed Aerial Vehicles (UAVs) are widely deployed across diverse applications due to their mobility and agility. Recent advances in Large Language Models (LLMs) offer a transformative opportunity to enhance UAV intelligence beyond conventional optimization-based and learning-based approaches. Integrating LLMs into UAV systems enables advanced environmental understanding, swarm coordination, mobility optimization, and high-level task reasoning, allowing more adaptive and context-aware aerial operations. This survey systematically explores the intersection of LLMs and UAV technologies and proposes a unified framework that consolidates existing architectures, methodologies, and applications for UAVs. We first present a structured taxonomy of LLM adaptation techniques for UAVs, including pretraining, fine-tuning, Retrieval-Augmented Generation (RAG), and prompt engineering, along with key reasoning capabilities such as Chain-of-Thought (CoT) and In-Context Learning (ICL). We then examine LLM-assisted UAV communications and operations, covering navigation, mission planning, swarm control, safety, autonomy, and network management. Next, we discuss Multimodal LLMs (MLLMs) for human-swarm interaction, perception-driven navigation, and collaborative control. Finally, we address ethical considerations, including bias, transparency, accountability, and Human-in-the-Loop (HITL) strategies, and outline future research directions. Overall, this work positions LLM-assisted UAVs as a foundation for intelligent and adaptive aerial systems.
