UAV-Assisted Space-Air-Ground Integrated Networks: A Technical Review of Recent Learning Algorithms
Atefeh H. Arani, Peng Hu, Yeying Zhu
TL;DR
This paper addresses the challenge of optimizing UAV-assisted space-air-ground integrated networks (SAGINs) by surveying learning-based approaches for joint 3D trajectory design and channel/resource allocation. It systematically reviews Q-learning, multi-armed bandit (MAB), deep reinforcement learning (DRL) in the form of deep Q-networks (DQN), satisfaction-based learning, and particle swarm optimization (PSO) within a unified SAGIN model that includes 3D UAV placement, backhaul via low Earth orbit (LEO) satellites, and quality-of-experience (QoE) oriented objectives. These objectives are formalized as $\Gamma_b(t)=\Phi_b\mathcal{F}(t)+\psi_b(1-\rho_b(t))$ with Jain’s fairness index $\mathcal{F}(t)=\frac{(\sum_k\bar{C}_k)^2}{|\mathcal{K}|\sum_k\bar{C}_k^2}$. The authors show, through simulations, that 3D satisfaction-based learning with channel allocation frequently outperforms the alternatives in outage, load balance, and fairness, while convergence and runtime vary across methods. The work provides design and deployment guidelines for UAV-enabled SAGINs and highlights open challenges including UAV collaboration, multi-hop networking, altitude/speed variability, and hybrid DRL-PSO strategies. Overall, the paper offers a comprehensive, algorithm-focused roadmap for optimizing complex SAGIN deployments with practical QoE considerations, and identifies promising directions for integrating advanced DRL techniques such as DDPG and PPO in continuous-action settings.
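To make the reward expression concrete, the following is a minimal Python sketch of Jain's fairness index and the weighted reward $\Gamma_b(t)$. It is an illustration under stated assumptions, not the authors' implementation: the function names `jain_fairness` and `reward` are hypothetical, $\bar{C}_k$ is taken to be a per-user average rate, $\rho_b(t)$ is treated as a normalized quantity in $[0, 1]$ (for example a load), and $\Phi_b$, $\psi_b$ are treated as scalar weights. The summary does not define these symbols, so these readings are assumptions.

```python
from typing import Sequence


def jain_fairness(rates: Sequence[float]) -> float:
    """Jain's fairness index: (sum_k C_k)^2 / (|K| * sum_k C_k^2).

    `rates` stands in for the per-user average rates C_k (an assumption).
    Returns a value in (0, 1]; 1 means all users receive the same rate.
    An empty or all-zero input returns 0.0 by convention in this sketch.
    """
    n = len(rates)
    sum_of_squares = sum(c * c for c in rates)
    if n == 0 or sum_of_squares == 0:
        return 0.0
    total = sum(rates)
    return (total * total) / (n * sum_of_squares)


def reward(fairness: float, rho: float, phi: float = 0.5, psi: float = 0.5) -> float:
    """Weighted reward Gamma_b(t) = phi * F(t) + psi * (1 - rho_b(t)).

    `phi` and `psi` play the role of the weights Phi_b and psi_b; the
    default values of 0.5 are illustrative only and not taken from the paper.
    `rho` is assumed to lie in [0, 1].
    """
    return phi * fairness + psi * (1.0 - rho)


if __name__ == "__main__":
    equal = jain_fairness([2.0, 2.0, 2.0, 2.0])   # identical rates
    skewed = jain_fairness([8.0, 0.0, 0.0, 0.0])  # one user gets everything
    print("fairness (equal rates): ", equal)
    print("fairness (skewed rates):", skewed)
    print("reward (equal, rho=0.2):", reward(equal, rho=0.2))
```

With identical rates the index reaches its maximum of 1, and when a single user out of $|\mathcal{K}|$ receives all the rate it drops to $1/|\mathcal{K}|$, so a larger fairness term and a smaller $\rho_b(t)$ both raise the reward.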
Abstract
Recent technological advancements in space, air, and ground components have made possible a new network paradigm called the space-air-ground integrated network (SAGIN). Unmanned aerial vehicles (UAVs) play a key role in SAGINs. However, the high dynamics and complexity of UAVs make real-world deployment a significant barrier to realizing such networks. UAVs are expected to meet key performance requirements, despite limited maneuverability and resources, while operating alongside space and terrestrial components. Therefore, employing UAVs in various usage scenarios requires carefully designed algorithmic approaches. This paper provides an essential review and analysis of recent learning algorithms in a UAV-assisted SAGIN. We consider possible reward functions and discuss the state-of-the-art algorithms for optimizing them, including Q-learning, deep Q-learning, multi-armed bandit, particle swarm optimization, and satisfaction-based learning algorithms. Unlike other survey papers, we focus on the methodological perspective of the optimization problem, which is applicable to various missions on a SAGIN. We consider real-world configurations and both 2-dimensional (2D) and 3-dimensional (3D) UAV trajectories to reflect deployment cases. Our simulations suggest that the 3D satisfaction-based learning algorithm outperforms the other approaches in most cases. With open challenges discussed at the end, we aim to provide design and deployment guidelines for UAV-assisted SAGINs.
