Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew
Ran Zhang, Bowei Li, Liyuan Zhang, Jiang Xie, Miao Wang
TL;DR
This work addresses the challenge of regulating UAV-based communication networks (UCNs) under dynamically changing UAV crews by proposing reinforcement learning (RL) based strategies that operate in both reactive and proactive modes. It develops a comprehensive framework distinguishing centralized deep RL (DRL) and distributed multi-agent RL (MARL) approaches, and introduces methods to handle mixed action spaces, exploration around crew changes, and robustness to varying fleet sizes. For solar-powered UCNs, the authors propose a two-subproblem decomposition and a centralized DRL solution to jointly optimize serving roles and charging profiles, and further extend it with game-theoretic MARL for hybrid cooperation-competition among UAVs. The practical impact lies in enabling autonomous, scalable regulation of UCNs in dynamic environments, with potential extensions to generative AI and wireless charging technologies to enhance resilience and efficiency.
Abstract
Unmanned Aerial Vehicle (UAV) based communication networks (UCNs) are a key component of future mobile networking. To handle the dynamic environments in UCNs, reinforcement learning (RL) has emerged as a promising solution owing to its strong capability for adaptive decision-making without requiring environment models. However, most existing RL-based research focuses on control strategy design assuming a fixed set of UAVs. Few works have investigated how UCNs should be adaptively regulated when the set of serving UAVs changes dynamically. This article discusses RL-based strategy design for adaptive UCN regulation given a dynamic UAV set, addressing both reactive strategies in general UCNs and proactive strategies in solar-powered UCNs. An overview of the UCN and the RL framework is first provided. Potential research directions with key challenges and possible solutions are then elaborated. Some of our recent works are presented as case studies to inspire innovative ways to handle a dynamic UAV crew with different RL algorithms.
