A Survey on DRL based UAV Communications and Networking: DRL Fundamentals, Applications and Implementations

Wei Zhao, Shaoxin Cui, Wen Qiu, Zhiqiang He, Zhi Liu, Xiao Zheng, Bomin Mao, Nei Kato

TL;DR

The surveyed work addresses the challenge of optimizing UAV communications in dynamic, multi-agent wireless networks by framing key problems (power allocation, channel assignment, caching, and task offloading) as optimization models solvable with deep reinforcement learning. It systematically reviews DRL fundamentals (MDP, value-based, policy-based, actor-critic, and multi-agent variants) and maps them to UAV applications, highlighting how DRL can handle nonlinearity, uncertainty, and scale in one- and multi-hop topologies. The paper emphasizes hybrid approaches that combine traditional optimization with MADRL, analyzes system-level challenges (training efficiency, non-stationarity, data requirements), and outlines open issues and future directions for robust, real-time UAV DRL deployments. Overall, this survey provides a practical roadmap for designing DRL-based UAV networking solutions with attention to scalability, collaboration, and adaptability in evolving 5G/6G environments.

Abstract

Unmanned aerial vehicles (UAVs) are playing an increasingly pivotal role in modern communication networks, offering flexibility and enhanced coverage for a variety of applications. However, UAV networks pose significant challenges due to their dynamic and distributed nature, particularly when dealing with tasks such as power allocation, channel assignment, caching, and task offloading. Traditional optimization techniques often struggle to handle the complexity and unpredictability of these environments, leading to suboptimal performance. This survey provides a comprehensive examination of how deep reinforcement learning (DRL) can be applied to solve these mathematical optimization problems in UAV communications and networking. Rather than simply introducing DRL methods, the focus is on demonstrating how these methods can be utilized to solve complex mathematical models of the underlying problems. We begin by reviewing the fundamental concepts of DRL, including value-based, policy-based, and actor-critic approaches. Then, we illustrate how DRL algorithms are applied to specific UAV network tasks, moving from problem formulation to DRL implementation. By framing UAV communication challenges as optimization problems, this survey emphasizes the practical value of DRL in dynamic and uncertain environments. We also explore the strengths of DRL in handling large-scale network scenarios and its ability to continuously adapt to changes in the environment. In addition, future research directions are outlined, highlighting the potential for DRL to further enhance UAV communications and expand its applicability to more complex, multi-agent settings.

Paper Structure

This paper contains 57 sections, 49 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Process of agent and environment interaction.
  • Figure 2: The taxonomy of DRL algorithms. Policy-based methods are not strictly devoid of critics, but policy-based approaches focus more on policy optimization, while actor-critic methods emphasize systematic optimization, involving more efficient and parallel sampling.
  • Figure 3: Different frameworks for MARL problems.
  • Figure 4: Power Allocation in different scenarios.
  • Figure 5: Channel assignment in different scenarios.
  • ...and 1 more figure