Multi-UAV Speed Control with Collision Avoidance and Handover-aware Cell Association: DRL with Action Branching

Zijiang Yan; Wael Jaafar; Bassant Selim; Hina Tabassum

TL;DR

This work tackles the challenge of jointly optimizing multi-UAV speed control and cell association on a 3D aerial highway to improve traffic flow, connectivity, and handover handling. It introduces a branching deep Q-network framework (BDQ/BDDQN) that decomposes the high-dimensional action space into transportation and telecommunication branches, leveraging a shared representation to coordinate decisions. The approach yields improved transportation and communication performance, reduces handover rates (with BDDQN achieving sub-1% HO rates after extensive training at moderate speeds), and demonstrates an effective trade-off when varying the number of available BSs. The study advances practical deployment of cooperative multi-UAV networks by incorporating collision avoidance, lane-changing dynamics, and HO-aware data-rate optimization.

Abstract

This paper presents a deep reinforcement learning solution for jointly optimizing the cell-association decisions and velocities of multiple UAVs on a 3D aerial highway. The objective is to enhance transportation and communication performance, including collision avoidance, connectivity, and handovers. The problem is formulated as a Markov decision process (MDP) in which each UAV's state is defined by its velocity and communication data rate. We propose a neural architecture with a shared decision module and multiple network branches, each dedicated to a specific action dimension in a 2D transportation-communication space. This design efficiently handles the multi-dimensional action space while letting each action dimension be decided independently. We introduce two models, Branching Dueling Q-Network (BDQ) and Branching Dueling Double Deep Q-Network (BDDQN), to demonstrate the approach. Simulation results show a significant improvement of 18.32% compared to existing benchmarks.
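
The shared-module-plus-branches design described in the abstract can be illustrated with a minimal, untrained sketch. It is not the authors' implementation: the layer sizes, the state vector, the branch sizes (5 speed actions, 15 candidate BSs), and the class name `BranchingDuelingQ` are all hypothetical, and the mean-advantage aggregation is the one commonly used in dueling networks, which this summary does not confirm for the paper.

```python
import random

def linear(x, w, b):
    """Dense layer: w holds one row of weights per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out

class BranchingDuelingQ:
    """Hypothetical sketch: shared module, one state-value head, one advantage head per branch."""

    def __init__(self, state_dim, hidden_dim, branch_sizes, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(state_dim, hidden_dim, rng)
        self.value = init_layer(hidden_dim, 1, rng)
        self.branches = [init_layer(hidden_dim, n, rng) for n in branch_sizes]

    def q_values(self, state):
        # Shared latent representation of the state (ReLU activation).
        h = [max(0.0, z) for z in linear(state, *self.shared)]
        # Scalar state value shared by every branch.
        v = linear(h, *self.value)[0]
        q = []
        for w, b in self.branches:
            adv = linear(h, w, b)
            mean_adv = sum(adv) / len(adv)
            # Assumed aggregation: Q_d(s, a) = V(s) + A_d(s, a) - mean over a' of A_d(s, a').
            q.append([v + a - mean_adv for a in adv])
        return q

    def greedy_action(self, state):
        # Each branch picks its own argmax, so action dimensions are decided independently.
        return [max(range(len(qd)), key=qd.__getitem__) for qd in self.q_values(state)]

# Assumed sizes: branch 0 = transportation (5 actions), branch 1 = cell association (15 BSs).
net = BranchingDuelingQ(state_dim=4, hidden_dim=8, branch_sizes=[5, 15])
state = [0.5, -0.2, 0.1, 0.9]  # placeholder state features
q = net.q_values(state)
action = net.greedy_action(state)
```

With these assumed sizes the network emits 5 + 15 = 20 Q-values per state, whereas a single flat head over every joint (speed, BS) pair would need 5 × 15 = 75 outputs, which is the scaling benefit that action branching is meant to provide.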

Paper Structure (17 sections, 17 equations, 5 figures, 1 algorithm)

Introduction
System Model
G2A Channel Model
BS's antenna gain
LoS probability
Path loss
Received Power and Achievable Data Rate Analysis
Handovers
Problem Formulation as MDP and Proposed DRL with Action Branching
Observation and State Space
Action Space
Reward Function Design
UAV Transportation Reward
UAV Communication Reward
Proposed Branching Dueling Q-Network-based Methods
...and 2 more sections

Figures (5)

Figure 1: Illustration of the proposed aerial network model (top view). Blue circles represent BSs; solid and dashed lines represent desired and interfering links, respectively.
Figure 2: Proposed action branching architecture. The shared module computes a latent representation of the input state and passes it forward to the action branches.
Figure 3: UAV performance ($15$ BSs, different speeds $v$): (a) Total transportation reward (b) Total communication reward (c) HO rate (BDDQN).
Figure 4: UAV performance ($15$ BSs, $v=10$ m/s): (a) Avg. transportation reward (b) Avg. communication reward (c) Avg. HO rate.
Figure 5: UAV performance ($v=10$ m/s): (a) Avg. transportation reward (b) Avg. communication reward (c) Avg. HO rate (BDDQN).
