Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial

Yousef Emami, Hao Zhou, Radha Reddy, Atefeh Hajijamali Arani, Biliang Wang, Kai Li, Luis Almeida, Zhu Han

TL;DR

This survey investigates how Large Language Models (LLMs) and Multimodal LLMs (MLLMs) can augment UAV operations and communications, proposing a unified framework that links perception, planning, and control through cross-modal reasoning. It classifies LLM adaptation techniques (pretraining, fine-tuning, RAG, and prompting) and highlights practical deployment across edge, on-device, and hybrid architectures, including techniques to extend context and ground outputs in domain knowledge. The paper then details LLM-enabled UAV navigation, swarm coordination, safety, and network optimization, as well as the emergence of MLLMs for vision–language guidance, with case studies, benchmarks (e.g., UAVBench, UAVThreatBench, AirCopBench), and a discussion of ethical, safety, and environmental considerations. Finally, it outlines near-, mid-, and long-term trends toward domain-specific UAV LLMs, edge-enabled inference, self-improving systems, and sustainable AI infrastructure, underscoring the practical and societal impact of intelligent, responsible aerial systems.

Abstract

Uncrewed Aerial Vehicles (UAVs) are widely deployed across diverse applications due to their mobility and agility. Recent advances in Large Language Models (LLMs) offer a transformative opportunity to enhance UAV intelligence beyond conventional optimization-based and learning-based approaches. Integrating LLMs into UAV systems enables advanced environmental understanding, swarm coordination, mobility optimization, and high-level task reasoning, allowing more adaptive and context-aware aerial operations. This survey systematically explores the intersection of LLMs and UAV technologies and proposes a unified framework that consolidates existing architectures, methodologies, and applications for UAVs. We first present a structured taxonomy of LLM adaptation techniques for UAVs, including pretraining, fine-tuning, Retrieval-Augmented Generation (RAG), and prompt engineering, along with key reasoning capabilities such as Chain-of-Thought (CoT) and In-Context Learning (ICL). We then examine LLM-assisted UAV communications and operations, covering navigation, mission planning, swarm control, safety, autonomy, and network management. Next, we discuss Multimodal LLMs (MLLMs) for human-swarm interaction, perception-driven navigation, and collaborative control. Finally, we address ethical considerations, including bias, transparency, accountability, and Human-in-the-Loop (HITL) strategies, and outline future research directions. Overall, this work positions LLM-assisted UAVs as a foundation for intelligent and adaptive aerial systems.

Paper Structure (75 sections, 10 figures, 13 tables)

This paper contains 75 sections, 10 figures, 13 tables.

Figures (10)

  • Figure 1: Overall organization of the survey, illustrating the research foundations, LLM adaptation stack, core application domains (LLM-enabled communications and MLLM-driven intelligence), and responsible and future intelligence considerations for UAV systems.
  • Figure 2: LLM fine-tuning pipeline for UAV applications, showing the progression from a pretrained language model through UAV-specific data preparation, model and training setup, parameter-efficient fine-tuning (e.g., adapters, LoRA/QLoRA, and prompt tuning), followed by evaluation, deployment, continuous monitoring, and updates to produce a UAV-adapted LLM.
  • Figure 3: Conventional RAG pipeline for UAV applications, illustrating the workflow from UAV task/query formulation through task-aware query encoding, hybrid knowledge retrieval (vector, graph, and hybrid RAG), context augmentation, and grounded LLM-assisted response generation using domain-specific UAV knowledge sources.
  • Figure 4: Prompt engineering strategies for UAV applications, comparing in-context learning, chain-of-thought prompting, prompt-based planning, and self-refinement in terms of prompt structure, LLM roles, and resulting capabilities for reliable, scalable, and robust UAV decision-making.
  • Figure 5: Diagram of the LLM-assisted UAV navigation, planning, and placement framework, illustrating the integration of mission context, reasoning, decision-making, and multi-agent coordination for autonomous operations.
  • ...and 5 more figures
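
The RAG workflow summarized in the Figure 3 caption (query encoding, knowledge retrieval, context augmentation, grounded generation) can be illustrated with a minimal Python sketch. This is not the survey's implementation: the bag-of-words encoder stands in for a learned embedding model, the knowledge snippets are hypothetical placeholders, and the names `encode`, `cosine`, `retrieve`, and `build_prompt` are assumptions introduced here for illustration. The final generation step is left out; the sketch stops at the augmented prompt that would be passed to an LLM.

```python
import math
from collections import Counter

# Hypothetical UAV knowledge snippets (placeholders, not from the survey).
KNOWLEDGE_BASE = [
    "Return to home when battery level drops below the reserve threshold.",
    "Maintain line of sight with the ground station during relay missions.",
    "Reduce altitude in strong wind to limit drift during inspection.",
]


def encode(text):
    """Toy encoder: lowercase bag-of-words counts (stand-in for an embedding model)."""
    cleaned = text.lower().replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    query_vec = encode(query)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, encode(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Context augmentation: prepend the retrieved snippets to the task."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return (
        f"Context:\n{context}\n\n"
        f"Task: {query}\n"
        "Answer using only the context above."
    )


QUERY = "What should the UAV do when battery is low?"
prompt = build_prompt(QUERY, KNOWLEDGE_BASE)
print(prompt)
```

In a real deployment the encoder, the retrieval store (vector, graph, or hybrid, as the caption notes), and the LLM call would each replace the corresponding toy function; the control flow between them stays the same.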