Deep Reinforcement Learning Enabled Persistent Surveillance with Energy-Aware UAV-UGV Systems for Disaster Management Applications

Md Safwan Mondal, Subramanian Ramasamy, Pranav Bhounsule

TL;DR

This paper tackles energy-constrained, persistent disaster surveillance with a cooperative UAV-UGV system in which a mobile UGV recharges the UAV. It introduces a transformer-based DRL policy, trained with REINFORCE, that jointly plans UAV and UGV routes, including their rendezvous points, under open-ended EVRPTW-like (O-EVRPTW) constraints, with an objective based on age periods (the time gap between consecutive visits to a mission point). The approach is evaluated against heuristic baselines and an existing learning-based model across varied problem sizes and distributions, and in a Hurricane Harvey (2017) case study, where it shows better solution quality and scalability and adapts to dynamic changes and priority weighting. The reported results include more frequent visits to mission points and support for online planning, which points to practical potential for real-time disaster response with energy-aware UAV-UGV cooperation.
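To make the "trained with REINFORCE" step concrete, here is a minimal sketch of the REINFORCE surrogate loss for a batch of sampled routes. It is an illustration under stated assumptions, not the authors' training code: the function name `reinforce_loss`, the batch-mean baseline, and the example numbers are all hypothetical, and the summary above does not say which baseline the paper uses.

```python
from typing import List


def reinforce_loss(log_probs: List[float], costs: List[float]) -> float:
    """REINFORCE surrogate loss for a batch of sampled routes.

    log_probs[i] is the summed log-probability of the actions in route i,
    and costs[i] is the cost of that route (for example, an age-based cost).
    A batch-mean baseline is ASSUMED here purely to reduce variance.
    """
    if len(log_probs) != len(costs) or not costs:
        raise ValueError("log_probs and costs must be non-empty and equal length")
    baseline = sum(costs) / len(costs)
    # Minimizing this quantity raises the probability of routes whose cost
    # is below the baseline and lowers it for routes above the baseline.
    weighted = [(c - baseline) * lp for lp, c in zip(log_probs, costs)]
    return sum(weighted) / len(weighted)


# Hypothetical batch of two sampled routes.
loss = reinforce_loss(log_probs=[-1.0, -2.0], costs=[10.0, 20.0])
```

In a real training loop the log-probabilities would come from the policy network and the loss would be differentiated with an autodiff library; plain floats are used here only to show the arithmetic.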

Abstract

Integrating Unmanned Aerial Vehicles (UAVs) with Unmanned Ground Vehicles (UGVs) provides an effective solution for persistent surveillance in disaster management. UAVs excel at covering large areas rapidly, but their range is limited by battery capacity. UGVs, though slower, can carry larger batteries for extended missions. By using UGVs as mobile recharging stations, UAVs can extend mission duration through periodic refueling, leveraging the complementary strengths of both systems. To optimize this energy-aware UAV-UGV cooperative routing problem, we propose a planning framework that determines optimal routes and recharging points between a UAV and a UGV. Our solution employs a deep reinforcement learning (DRL) framework built on an encoder-decoder transformer architecture with multi-head attention mechanisms. This architecture enables the model to sequentially select actions for visiting mission points and coordinating recharging rendezvous between the UAV and UGV. The DRL model is trained to minimize the age periods (the time gap between consecutive visits) of mission points, ensuring effective surveillance. We evaluate the framework across various problem sizes and distributions, comparing its performance against heuristic methods and an existing learning-based model. Results show that our approach consistently outperforms these baselines in both solution quality and runtime. Additionally, we demonstrate the DRL policy's applicability in a real-world disaster scenario as a case study and explore its potential for online mission planning to handle dynamic changes. Adapting the DRL policy for priority-driven surveillance highlights the model's generalizability for real-time disaster response.
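The abstract defines an age period as the time gap between consecutive visits to a mission point. The short sketch below computes those gaps from visit timestamps. The function names, the mean and maximum summaries, and the sample timestamps are assumptions made for illustration; the abstract does not state how the paper aggregates age periods into its training objective.

```python
from typing import Dict, List


def age_periods(visit_times: List[float]) -> List[float]:
    """Return the gaps between consecutive visits to one mission point."""
    times = sorted(visit_times)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def summarize_ages(visits: Dict[str, List[float]]) -> Dict[str, float]:
    """Aggregate age periods over all mission points.

    Mean and maximum are ASSUMED summaries chosen for illustration.
    """
    gaps = [gap for times in visits.values() for gap in age_periods(times)]
    if not gaps:
        return {"mean_age": 0.0, "max_age": 0.0}
    return {"mean_age": sum(gaps) / len(gaps), "max_age": max(gaps)}


# Hypothetical visit timestamps (in minutes) for two mission points.
visits = {"A": [0.0, 10.0, 25.0], "B": [5.0, 20.0]}
summary = summarize_ages(visits)
```

Shorter age periods mean each point is revisited more often, which is why a policy that minimizes them yields more persistent surveillance.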

Paper Structure

This paper contains 27 sections, 26 equations, 11 figures, 4 tables, 1 algorithm.

Figures (11)

  • Figure 1: Illustration of collaboration between an energy-constrained UAV and a UGV for surveying disaster-stricken areas. The UAV performs continuous surveillance and recharges through the UGV. The proposed DRL policy determines mission point visits and coordinates UAV-UGV recharging.
  • Figure 2: Bilevel optimization workflow: a) Given scenario with UAV and ground points, and UGV's traversal direction along the road network from the starting depot as obtained from the TSP solution. b) Available refuel stops provided by the UGV during O-EVRPTW route planning for the UAV. c) Recharging rendezvous between the UAV and UGV, along with their respective route sorties, derived from the O-EVRPTW solution.
  • Figure 3: MDP representation for the UAV-UGV cooperative persistent surveillance problem utilizing a Transformer architecture.
  • Figure 4: Architecture of the proposed Transformer network. The encoder consists of three attention layers that generate input embeddings from raw data, while the decoder constructs a context vector based on the current state. The network leverages both input embeddings and the context vector, passing them through multi-head and single-head attention layers to determine the next action, sequentially forming the cooperative route for persistent surveillance, as depicted.
  • Figure 5: Training reward curves across different problem sizes under DRL and AM policies.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Definition 1
  • Definition 2