MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning
Demetros Aschu, Robinroy Peter, Sausar Karaf, Aleksey Fedoseev, Dzmitry Tsetserukou
TL;DR
MARLander tackles safe, precise swarm landing by learning decentralized control policies with multi-agent deep reinforcement learning. The method uses a PPO-based policy that relies on local observations from a two-drone system; it is trained in a PyBullet-based simulation with randomized 4 × 4 × 4 m setups and deployed on Crazyflie drones with Vicon indoor localization. Results show centimeter-scale landing accuracy: ≈2.26–2.97 cm on stationary targets and ≈3.93 cm on moving platforms, with success rates of ≈91.67% (static) and ≈75% (moving), outperforming PID+APF baselines and single-agent RL. The work demonstrates decentralized coordination for drone swarms that could scale, with potential impact on logistics, safety, and rescue missions.
Abstract
Achieving safe and precise landings for a swarm of drones poses a significant challenge, primarily due to the limitations of conventional control and planning methods. This paper presents the implementation of multi-agent deep reinforcement learning (MADRL) techniques for the precise landing of a drone swarm at relocated target locations. The system is trained in a realistic simulated environment with a maximum velocity of 3 m/s in training spaces of 4 × 4 × 4 m and deployed on Crazyflie drones with a Vicon indoor localization system. The experimental results show that the proposed approach achieved a landing accuracy of 2.26 cm on stationary platforms and 3.93 cm on moving platforms, surpassing a baseline method that combines a proportional-integral-derivative (PID) controller with an Artificial Potential Field (APF). This research highlights drone landing technologies that eliminate the need for analytical centralized systems, potentially offering scalability and revolutionizing applications in logistics, safety, and rescue missions.
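
To make the reported metrics concrete, the following is a minimal sketch of how a landing accuracy and a success rate could be computed from logged touchdown positions. The abstract does not state the exact definitions used, so this assumes landing accuracy is the mean horizontal distance between the touchdown point and the target center, and that a landing counts as successful when this distance is within a threshold. The function names, the 5 cm threshold, and the trial data are all hypothetical and chosen only for illustration; they are not the paper's measurements.

```python
import math


def landing_error_cm(landed_xy, target_xy):
    """Horizontal touchdown-to-target distance in cm (inputs in metres)."""
    dx = landed_xy[0] - target_xy[0]
    dy = landed_xy[1] - target_xy[1]
    return 100.0 * math.hypot(dx, dy)


def summarize_trials(trials, success_radius_cm=5.0):
    """Mean landing error and success rate over (landed_xy, target_xy) pairs.

    success_radius_cm is an assumed threshold, not a value from the paper.
    """
    errors = [landing_error_cm(landed, target) for landed, target in trials]
    successes = sum(1 for e in errors if e <= success_radius_cm)
    return {
        "mean_error_cm": sum(errors) / len(errors),
        "success_rate_pct": 100.0 * successes / len(errors),
    }


# Illustrative (made-up) trials: (touchdown position, target position) in metres.
trials = [
    ((0.03, 0.00), (0.00, 0.00)),
    ((1.00, 1.00), (1.00, 1.02)),
    ((0.50, 0.10), (0.50, 0.00)),
    ((-0.20, 0.30), (-0.20, 0.30)),
    ((0.00, 0.00), (0.04, 0.00)),
]

summary = summarize_trials(trials)
print(f"mean error: {summary['mean_error_cm']:.2f} cm, "
      f"success rate: {summary['success_rate_pct']:.1f}%")
```

For a moving platform, the target position would be the platform's position at the moment of touchdown rather than a fixed point; the same computation applies under that assumption.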
