UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning

Saichao Liu; Geng Sun; Jiahui Li; Shuang Liang; Qingqing Wu; Pengfei Wang; Dusit Niyato

TL;DR

The paper tackles the joint optimization of UAV positions and excitation current weights for collaborative beamforming (CB) from a UAV-enabled virtual antenna array (UVAA) toward remote base stations, aiming to maximize the UVAA transmission rate while minimizing the UAVs' motion energy. It models the problem as a multi-agent Markov game and introduces HATRPO-UCB, an improved heterogeneous-agent trust region policy optimization (HATRPO) algorithm with three enhancements: observation augmentation, an agent-specific global state, and a Beta-distributed policy that handles bounded actions. In simulations, HATRPO-UCB converges faster and achieves better energy-rate tradeoffs than the baselines, and ablation experiments confirm the value of each enhancement. The approach offers a scalable, real-time framework for energy-efficient, CB-enabled UAV swarms in dynamic air-to-ground (A2G) networks.
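To illustrate why a Beta-distributed policy suits bounded actions, the following is a minimal sketch, not the paper's implementation. It assumes a policy head that outputs two positive shape parameters per action dimension; the names `sample_bounded_action`, `alpha`, `beta`, `low`, and `high` are hypothetical. A Beta sample always lies in [0, 1], so an affine rescaling keeps every action inside its bounds without the clipping that a Gaussian policy would need.

```python
import numpy as np


def sample_bounded_action(alpha, beta, low, high, rng):
    """Draw one action per dimension from Beta(alpha, beta), rescaled to [low, high].

    alpha, beta: positive shape parameters (assumed to come from a policy network).
    low, high:   action bounds (hypothetical values chosen for illustration).
    """
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    u = rng.beta(alpha, beta)  # each sample lies in [0, 1]
    return low + (high - low) * u  # affine map onto the bounded action range


rng = np.random.default_rng(0)
# Two action dimensions with different shape parameters, bounded to [-1, 1].
actions = np.array(
    [sample_bounded_action([2.0, 5.0], [2.0, 1.5], -1.0, 1.0, rng) for _ in range(1000)]
)
print(actions.min(), actions.max())  # both values stay within the bounds
```

Because the support of the distribution matches the action range, no probability mass falls outside the feasible set, which is the property the TL;DR refers to as handling bounded actions.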

Abstract

In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted air-to-ground communication system, where multiple UAVs form a UAV-enabled virtual antenna array (UVAA) to communicate with remote base stations by utilizing collaborative beamforming. To improve the work efficiency of the UVAA, we formulate a UAV-enabled collaborative beamforming multi-objective optimization problem (UCBMOP) to simultaneously maximize the transmission rate of the UVAA and minimize the energy consumption of all UAVs by optimizing the positions and excitation current weights of all UAVs. This problem is challenging because the two optimization objectives conflict with each other and are non-concave with respect to the optimization variables. Moreover, the system is dynamic and the cooperation among UAVs is complex, so traditional methods need considerable time to compute a solution for a single task. In addition, as the task changes, the previously obtained solution becomes obsolete and invalid. To handle these issues, we leverage multi-agent deep reinforcement learning (MADRL) to address the UCBMOP. Specifically, we use heterogeneous-agent trust region policy optimization (HATRPO) as the basic framework and propose an improved HATRPO algorithm, namely HATRPO-UCB, in which three techniques are introduced to enhance performance. Simulation results demonstrate that the proposed algorithm learns a better strategy than other methods. Moreover, extensive experiments demonstrate the effectiveness of the proposed techniques.
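To make the role of the positions and excitation current weights concrete, here is a hedged sketch of the standard textbook array factor for an arbitrary 3-D antenna array. It is not taken from the paper, whose own model may differ. The function names, the 2.4 GHz carrier, the eight-UAV layout, and the assumption of perfect phase synchronization are all illustrative.

```python
import numpy as np


def unit_direction(theta, phi):
    """Unit vector for elevation angle theta and azimuth angle phi (radians)."""
    return np.array(
        [np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta)]
    )


def path_phases(positions, theta, phi, wavelength):
    """Propagation phase of each element toward direction (theta, phi)."""
    k = 2.0 * np.pi / wavelength  # wavenumber
    return k * positions @ unit_direction(theta, phi)


def array_factor(positions, weights, phases, theta, phi, wavelength):
    """Textbook array factor: sum of weighted, phase-shifted element contributions."""
    total_phase = path_phases(positions, theta, phi, wavelength) + phases
    return np.sum(weights * np.exp(1j * total_phase))


rng = np.random.default_rng(1)
wavelength = 3e8 / 2.4e9  # assumed 2.4 GHz carrier, for illustration only
positions = rng.uniform(0.0, 50.0, size=(8, 3))  # 8 hypothetical UAV positions (m)
weights = rng.uniform(0.2, 1.0, size=8)  # hypothetical excitation current weights

# Assumed perfect synchronization: initial phases cancel the path phases
# toward the target base station, so all contributions add coherently there.
target_theta, target_phi = np.pi / 3, np.pi / 4
phases = -path_phases(positions, target_theta, target_phi, wavelength)

gain_at_target = abs(array_factor(positions, weights, phases, target_theta, target_phi, wavelength))
gain_elsewhere = abs(array_factor(positions, weights, phases, np.pi / 6, np.pi / 2, wavelength))
print(gain_at_target, gain_elsewhere)
```

Toward the target, the magnitude equals the sum of the weights, while in other directions the contributions add with mismatched phases and the magnitude cannot exceed that sum. How much it falls off depends on the UAV positions, which is why positions and weights are natural optimization variables, and why moving the UAVs to improve the beam pattern costs motion energy.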

Paper Structure (28 sections, 24 equations, 11 figures, 3 tables, 1 algorithm)

Introduction
Related Work
System Model and Problem Formulation
UVAA Communication Model
UAV Energy Consumption Model
Problem Formulation
MADRL-Based UAV-enabled Collaborative Beamforming
Markov Game for UCBMOP
State and Observation Space
Action Space
Reward Function
HATRPO-UCB Algorithm
Conventional HATRPO
Algorithm Design of HATRPO-UCB
...and 13 more sections

Figures (11)

Figure 1: Sketch map of the considered UVAA communication system.
Figure 2: Energy consumption of UAV in the horizontal and vertical flights under different speeds.
Figure 3: The transmission rate of UVAA to BSs at different distances when performing CB communication. The parameters about LoS probability $C$ and $D$ are set as 10 and 0.6, respectively.
Figure 4: Flowchart of HATRPO-UCB algorithm for UAV-enabled CB.
Figure 5: Convergence performance of different methods.
...and 6 more figures

Theorems & Definitions (3)

Remark 1
Remark 2
Remark 3
