MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning

Demetros Aschu; Robinroy Peter; Sausar Karaf; Aleksey Fedoseev; Dzmitry Tsetserukou

TL;DR

MARLander tackles the challenge of safe, precise swarm landings by learning decentralized control policies via multi-agent deep reinforcement learning. The method uses a PPO-based policy that relies on local observations from a two-drone system, trained in a PyBullet-based simulation with randomized setups in a 4 × 4 × 4 m space and deployed on Crazyflie drones with Vicon indoor localization. Results show centimeter-scale landing accuracy on stationary targets (≈2.26–2.97 cm) and ≈3.93 cm on moving platforms, with success rates of ≈91.67% (static) and ≈75% (moving), outperforming PID+APF baselines and single-agent RL. The work demonstrates scalable, decentralized coordination for drone swarms with potential impact on logistics, safety, and rescue missions.

Abstract

Achieving safe and precise landings for a swarm of drones is a significant challenge, largely because of the limitations of conventional control and planning methods. This paper presents an implementation of multi-agent deep reinforcement learning (MADRL) for the precise landing of a drone swarm at relocated target locations. The system is trained in a realistic simulated environment, with a maximum velocity of 3 m/s in training spaces of 4 × 4 × 4 m, and deployed on Crazyflie drones with a Vicon indoor localization system. In experiments, the proposed approach achieved a landing accuracy of 2.26 cm on stationary platforms and 3.93 cm on moving platforms, surpassing a baseline that combines a proportional-integral-derivative (PID) controller with an Artificial Potential Field (APF). This research highlights drone landing technologies that eliminate the need for analytical centralized systems, potentially offering scalability and revolutionizing applications in logistics, safety, and rescue missions.
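
For readers unfamiliar with the APF part of the baseline, the following is a minimal sketch of a textbook artificial potential field velocity command: an attractive term toward the landing target plus a repulsive term that acts only near other drones. It is not the authors' implementation. The function name `apf_velocity`, the gains `k_att` and `k_rep`, and the influence radius `d0` are hypothetical values chosen for illustration; only the 3 m/s velocity cap comes from the abstract.

```python
import math

MAX_SPEED = 3.0  # m/s, the maximum velocity quoted in the abstract


def apf_velocity(pos, target, neighbors, k_att=1.0, k_rep=0.5, d0=1.0):
    """Velocity command from a basic artificial potential field.

    k_att, k_rep and d0 are illustrative assumptions, not values from the paper.
    """
    # Attractive term: pulls the drone straight toward its landing target.
    v = [k_att * (t - p) for p, t in zip(pos, target)]

    for other in neighbors:
        diff = [p - o for p, o in zip(pos, other)]
        d = math.sqrt(sum(c * c for c in diff))
        # Repulsive term: active only inside the influence radius d0.
        if 1e-9 < d < d0:
            # Magnitude k_rep * (1/d - 1/d0) / d^2 along the unit vector diff/d.
            gain = k_rep * (1.0 / d - 1.0 / d0) / d ** 3
            v = [vi + gain * c for vi, c in zip(v, diff)]

    # Clamp the command to the maximum allowed speed.
    speed = math.sqrt(sum(c * c for c in v))
    if speed > MAX_SPEED:
        v = [c * MAX_SPEED / speed for c in v]
    return v


# One drone hovering 1 m above its target, with no neighbors nearby.
print(apf_velocity([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], []))
```

In a PID+APF pipeline of this kind, a command like this would typically serve as the setpoint that a low-level PID controller tracks. A learned policy instead maps each drone's local observations directly to actions.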

Paper Structure (26 sections, 6 equations, 8 figures, 2 tables)

Introduction
Related Works
Methodology
System Overview
Problem Formulation and Preliminaries
Environment Setup
Observation and Actions
Reward Function
Model Architecture
Simulation Setup
Training Configuration
Experiments
Experimental Setup
MARLander experiment with stationary platform
Description
...and 11 more sections

Figures (8)

Figure 1: Two MARLander drones landing on the target platform mounted on the robot manipulator.
Figure 2: General system overview of MARLander.
Figure 3: Neural network architecture for the PPO algorithm.
Figure 4: Gym PyBullet environment for simulating a MADRL-driven swarm of drones.
Figure 5: Mean episode reward during training.
...and 3 more figures
