Distributed Autonomous Swarm Formation for Dynamic Network Bridging
Raffaele Galliera, Thies Möhlenhof, Alessandro Amato, Daniel Duran, Kristen Brent Venable, Niranjan Suri
TL;DR
This work addresses the problem of establishing a robust communication link between two moving targets using a swarm of agents in environments lacking reliable infrastructure. It formulates the task as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and proposes a MARL approach inspired by Graph Convolutional Reinforcement Learning (DGN), with message passing and latent neighborhood representations, trained in a fully cooperative setting with a shared Q-function. The reward combines a base connectivity term $R_{\text{base}}(s) = \frac{|C_{\text{max}}(s)|}{|\boldsymbol{V}|}$, a centroid distance penalty $P_{\text{cent}}(s)$, and a target-path bonus $B_{\text{path}} = 100$, encouraging both network cohesion and proximity to the targets: $R(s,a) = B_{\text{path}}(s)$ if a path between $T_1$ and $T_2$ exists, and $R(s,a) = R_{\text{base}}(s) - P_{\text{cent}}(s)$ otherwise. The method is evaluated in simulation and, as a step toward sim-to-real transfer, in a near Live Virtual Constructive (LVC) UAV framework. Results show that the learned agents can bridge the targets for most of the episode duration, though a centralized heuristic remains superior, leaving room for further optimization ahead of real-world deployment in disaster response and remote connectivity scenarios. The study lays groundwork for scalable, decentralized coordination in dynamic ad-hoc networks.
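The reward structure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the bonus of 100 is granted when the two targets are in the same connected component, that $|C_{\text{max}}(s)|$ is the size of the largest connected component, and that $\boldsymbol{V}$ contains every node in the communication graph, targets included. The form of the centroid penalty is not given here, so the sketch takes it as a precomputed number; the names `connected_components`, `reward`, and `p_cent` are hypothetical.

```python
from collections import deque

B_PATH = 100.0  # target-path bonus; the value 100 is stated in the text


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        components.append(component)
    return components


def reward(nodes, edges, t1, t2, p_cent):
    """Sketch of R(s, a). `p_cent` is the centroid distance penalty, supplied by
    the caller because its exact form is an assumption left open here."""
    components = connected_components(nodes, edges)
    # Assumption: a path between the targets means they share a component.
    if any(t1 in c and t2 in c for c in components):
        return B_PATH
    r_base = max(len(c) for c in components) / len(nodes)  # |C_max| / |V|
    return r_base - p_cent


nodes = ["T1", "A1", "A2", "T2"]
partial = [("T1", "A1"), ("A1", "A2")]   # T2 is still disconnected
bridged = partial + [("A2", "T2")]       # the agents now bridge T1 and T2

print(reward(nodes, partial, "T1", "T2", p_cent=0.25))  # 0.5  (3/4 - 0.25)
print(reward(nodes, bridged, "T1", "T2", p_cent=0.25))  # 100.0
```

In the partially connected case the largest component holds three of the four nodes, so the reward is the base term minus the penalty. Once the chain reaches the second target, the bonus replaces both terms.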
Abstract
Effective operation and seamless cooperation of robotic systems are fundamental components of next-generation technologies and applications. In contexts such as disaster response, swarm operations require coordinated behavior and mobility control to be handled in a distributed manner, with the quality of the agents' actions relying heavily on the communication between them and on the underlying network. In this paper, we formulate the problem of dynamic network bridging as a novel Decentralized Partially Observable Markov Decision Process (Dec-POMDP), in which a swarm of agents cooperates to form a link between two distant moving targets. Furthermore, we propose a Multi-Agent Reinforcement Learning (MARL) approach to the problem based on Graph Convolutional Reinforcement Learning (DGN), which naturally applies to the networked, distributed nature of the task. The proposed method is evaluated in a simulated environment and compared to a centralized heuristic baseline, showing promising results. Moreover, a further step in the direction of sim-to-real transfer is presented by additionally evaluating the proposed approach in a near Live Virtual Constructive (LVC) UAV framework.
