Motion Control in Multi-Rotor Aerial Robots Using Deep Reinforcement Learning

Gaurav Shetty, Mahya Ramezani, Hamed Habibi, Holger Voos, Jose Luis Sanchez-Lopez

TL;DR

The paper addresses robust motion control for multi-rotor drones performing additive manufacturing (AM) under dynamic payloads and disturbances. It compares two off-policy DRL algorithms, DDPG and TD3, within a curriculum learning framework for 3D waypoint navigation, using a detailed UAV-MDP and simulation in MATLAB/Simulink. TD3 consistently delivers higher stability, lower positional error, and higher success rates, especially when mass varies. Curriculum learning and an expanded observation space that includes accelerations further improve performance. The work demonstrates a scalable path toward autonomous drone control in AM, with implications for large-scale, complex, and hazardous deployment environments.

Abstract

This paper investigates the application of Deep Reinforcement Learning (DRL) to address motion control challenges in drones for additive manufacturing (AM). Drone-based additive manufacturing promises flexible and autonomous material deposition in large-scale or hazardous environments. However, achieving robust real-time control of a multi-rotor aerial robot under varying payloads and potential disturbances remains challenging. Traditional controllers like PID often require frequent parameter re-tuning, limiting their applicability in dynamic scenarios. We propose a DRL framework that learns adaptable control policies for multi-rotor drones performing waypoint navigation in AM tasks. We compare Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) within a curriculum learning scheme designed to handle increasing complexity. Our experiments show TD3 consistently balances training stability, accuracy, and success, particularly when mass variability is introduced. These findings provide a scalable path toward robust, autonomous drone control in additive manufacturing.
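
For readers unfamiliar with the two algorithms compared here, the following is a minimal sketch of how their bootstrapped critic targets differ, written in Python with NumPy. It is not taken from the paper, whose experiments run in MATLAB/Simulink. The function names, the discount factor, the noise parameters, and the action bounds of [-1, 1] are all illustrative assumptions.

```python
import numpy as np


def ddpg_target(reward, done, q_next, gamma=0.99):
    """DDPG-style target: bootstrap from a single target critic's estimate."""
    return reward + gamma * (1.0 - done) * q_next


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3-style target: bootstrap from the smaller of two target critics."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


def smoothed_target_action(action, rng, sigma=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0):
    """TD3 target policy smoothing: add clipped Gaussian noise to the
    target action, then clip the result to the (assumed) action bounds."""
    noise = rng.normal(0.0, sigma, size=np.shape(action))
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(action + noise, low, high)


# Two transitions with hypothetical values: the second one is terminal.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([2.0, 5.0])
q2_next = np.array([1.0, 7.0])

y_ddpg = ddpg_target(reward, done, q1_next, gamma=0.5)
y_td3 = td3_target(reward, done, q1_next, q2_next, gamma=0.5)
```

Taking the minimum over two critics counteracts the value overestimation that a single critic can accumulate. Together with target smoothing and less frequent actor updates, this is the usual explanation for TD3 training more steadily than DDPG.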

Paper Structure

This paper contains 12 sections, 9 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: 3D Scatter Plot of Reward Function
  • Figure 2: Simulink Model
  • Figure 3: Training Comparison of TD3 and DDPG
  • Figure 4: Test Results of TD3 Agent Trained with Different Methods
  • Figure 5: Test Results with Acceleration in Observation Space of TD3 Agent
  • ...and 1 more figure