Deep Reinforcement Learning based Autonomous Decision-Making for Cooperative UAVs: A Search and Rescue Real World Application

Thomas Hickling, Maxwell Hogan, Abdulla Tammam, Nabil Aouf

TL;DR

This work tackles autonomous decision-making and coordination for cooperative UAVs operating in GNSS-denied indoor spaces for search-and-rescue. It combines a TD3-based DRL guidance policy with an Artificial Potential Field–inspired reward to promote obstacle avoidance and smooth navigation, and a Graph Attention Network–based task allocator to enable dynamic, real-time multi-UAV coordination. Robust indoor odometry is achieved by integrating LIDAR-SLAM with depth sensing to mitigate hallway-induced height drift. Evaluations in simulation, real-world tests, and NATO Sapience competition scenarios demonstrate faster 3D mapping, reliable obstacle avoidance, and successful synchronized deliveries, culminating in a first-place finish and highlighting practical implications for SAR missions in challenging environments.

Abstract

This paper proposes a holistic framework for autonomous guidance, navigation, and task distribution among multi-drone systems operating in Global Navigation Satellite System (GNSS)-denied indoor settings. We advocate for a Deep Reinforcement Learning (DRL)-based guidance mechanism, utilising the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. To improve the efficiency of the training process, we incorporate an Artificial Potential Field (APF)-based reward structure, enabling the agent to refine its movements, thereby promoting smoother paths and enhanced obstacle avoidance in indoor contexts. Furthermore, we tackle the issue of task distribution among cooperative UAVs through a DRL-trained Graph Convolutional Network (GCN). This GCN represents the interactions between drones and tasks, facilitating dynamic, real-time task allocation that reflects the current environmental conditions and the capabilities of the drones. Such an approach fosters effective coordination and collaboration among multiple drones during search and rescue operations or other exploratory endeavours. Lastly, to ensure precise odometry in environments lacking GNSS, we employ Light Detection And Ranging (LiDAR) Simultaneous Localisation and Mapping (SLAM), complemented by a depth camera to mitigate the hallway problem. This integration offers robust localisation and mapping functionalities, thereby enhancing the system's dependability in indoor navigation. The proposed multi-drone framework not only elevates individual navigation capabilities but also optimises coordinated task allocation in complex, obstacle-laden environments. Experimental evaluations conducted in a setup tailored to meet the requirements of the NATO Sapience Autonomous Cooperative Drone Competition demonstrate the efficacy of the proposed system, culminating in a first-place finish in the 2024 Sapience competition.
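To make the idea of an APF-based reward concrete, the sketch below scores a UAV position with the negative of a classic attractive-plus-repulsive potential, so that moving toward the goal raises the reward and approaching an obstacle lowers it. The abstract does not give the paper's reward formula, so this is only an illustration under stated assumptions: the function name `apf_reward`, the gains `k_att` and `k_rep`, and the influence radius `d0` are hypothetical and are not taken from the paper.

```python
import math


def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=1.5):
    """Dense shaping reward: the negative of a textbook APF potential.

    All parameter names and default values are illustrative assumptions,
    not values reported by the authors.
    """
    # Attractive term: quadratic in the distance to the goal.
    d_goal = math.dist(pos, goal)
    u_att = 0.5 * k_att * d_goal ** 2

    # Repulsive term: only obstacles closer than the influence radius d0
    # contribute, and the penalty grows sharply as the distance shrinks.
    u_rep = 0.0
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-3)  # clamp to avoid division by zero
        if d < d0:
            u_rep += 0.5 * k_rep * (1.0 / d - 1.0 / d0) ** 2

    # Lower potential is better, so the reward is its negative.
    return -(u_att + u_rep)


if __name__ == "__main__":
    goal = (5.0, 0.0, 1.0)
    position = (2.0, 0.0, 1.0)
    print(apf_reward(position, goal, []))                  # free space
    print(apf_reward(position, goal, [(2.0, 0.5, 1.0)]))   # obstacle nearby
```

In a TD3 training loop, a term of this kind would typically be added to sparse goal and collision rewards at every step, giving the agent a gradient to follow long before it first reaches a goal.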

Paper Structure

This paper contains 61 sections, 14 equations, 21 figures.

Figures (21)

  • Figure 1: The arena built for the Sapience competition as seen by the eight cameras used for monitoring the UAVs
  • Figure 2: The floor plan for the constructed building
  • Figure 3: City University's autonomous drone
  • Figure 4: The architecture of the guidance AI's actor network
  • Figure 5: The TD3 Algorithm Architecture used for training the guidance AI
  • ...and 16 more figures