Meta Reinforcement Learning for Strategic IoT Deployments Coverage in Disaster-Response UAV Swarms

Marwan Dhuheir; Aiman Erbad; Ala Al-Fuqaha

TL;DR

The paper addresses energy-efficient path planning for a dynamic swarm of UAVs that collects data from ground IoT devices in disaster scenarios, with priority given to coverage of strategic locations. It formulates an NP-hard optimization problem that minimizes total energy consumption subject to minimum data-rate and time constraints, and proposes a lightweight meta-reinforcement learning (Meta-RL) solution that adapts quickly when UAVs join or leave the swarm. The approach models the wireless channel in detail (LoS/NLoS) together with data-delivery delays, and defines an energy model that accounts for operational and communication power as well as passes over strategic locations. In simulations on an urban grid, the Meta-RL method converges faster than the PPO, Actor-Critic, and DQN baselines, achieves higher service satisfaction at strategic locations, and adapts more readily to changes in swarm size. The work offers a practical framework for robust, energy-conscious IoT data collection in rapidly changing disaster-response environments, with potential relevance to real deployments and emergency communications.
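As a rough illustration of what an LoS/NLoS air-to-ground channel model can look like, the sketch below uses the widely cited sigmoid LoS-probability form, in which the chance of a line-of-sight link grows with the elevation angle between the ground device and the UAV. This summary does not give the paper's exact channel equations, so the functional form, the constants `a`, `b`, `eta_los_db`, `eta_nlos_db`, the carrier frequency, and both function names are assumptions chosen for illustration only.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def los_probability(elevation_deg: float, a: float = 9.61, b: float = 0.16) -> float:
    """Sigmoid LoS probability as a function of elevation angle (degrees).

    a and b are environment-dependent constants; the defaults are values
    commonly quoted for urban areas and are an assumption here.
    """
    return 1.0 / (1.0 + a * math.exp(-b * (elevation_deg - a)))


def mean_path_loss_db(
    horizontal_m: float,
    altitude_m: float,
    fc_hz: float = 2e9,          # assumed carrier frequency
    eta_los_db: float = 1.0,     # assumed excess loss on LoS links
    eta_nlos_db: float = 20.0,   # assumed excess loss on NLoS links
) -> float:
    """Expected path loss: free-space loss plus LoS/NLoS-weighted excess loss."""
    distance = math.hypot(horizontal_m, altitude_m)
    elevation = math.degrees(math.atan2(altitude_m, horizontal_m))
    free_space = 20.0 * math.log10(4.0 * math.pi * fc_hz * distance / SPEED_OF_LIGHT)
    p_los = los_probability(elevation)
    return free_space + p_los * eta_los_db + (1.0 - p_los) * eta_nlos_db


if __name__ == "__main__":
    # A UAV at 100 m altitude, seen from devices at increasing ground distance.
    for ground_distance in (0.0, 100.0, 500.0):
        loss = mean_path_loss_db(ground_distance, 100.0)
        print(f"ground distance {ground_distance:6.1f} m -> {loss:6.1f} dB")
```

Under these assumptions, a UAV hovering closer to a device sees both a shorter link and a higher LoS probability, which is one reason path planning and achievable data rate are coupled in problems of this kind.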

Abstract

In the past decade, Unmanned Aerial Vehicles (UAVs) have attracted the attention of researchers in academia and industry for their potential use in critical emergency applications, such as providing wireless services to ground users and collecting data from disaster-affected areas, owing to their maneuverability and flexibility of movement. However, the UAVs' limited resources, energy budget, and strict mission completion time make it challenging to adopt them for these applications. Our system model considers a UAV swarm that navigates an area and collects data from ground IoT devices, focusing on providing better service to strategic locations and allowing UAVs to join and leave the swarm dynamically (e.g., for recharging). In this work, we introduce an optimization model that minimizes the total energy consumption and provides the optimal path planning of the UAVs under constraints on minimum completion time and transmit power. The formulated optimization problem is NP-hard, which makes it impractical for real-time decision making. Therefore, we introduce a lightweight meta-reinforcement learning solution that can also cope with sudden changes in the environment through fast convergence. We conduct extensive simulations and compare our approach to three state-of-the-art learning models. Our simulation results show that the proposed approach outperforms the three state-of-the-art algorithms in providing coverage to strategic locations, with faster convergence.

Paper Structure (9 sections, 18 equations, 4 figures, 1 table, 1 algorithm)

Introduction
System Model
Wireless Channel Model
Device-to-Device (D2D) Time Delay Model
Energy Consumption Model
Problem Formulation
Meta-Reinforcement Learning for Efficient Energy Consumption and Path Planning
Simulation Results and Analysis
Conclusion

Figures (4)

Figure 1: System model for multiple UAVs covering an area with strategic locations. The UAVs' mission is data collection from ground devices.
Figure 2: The number of visits to strategic locations in one time frame $T$.
Figure 3: Adaptivity of the Meta-RL algorithm to changes in the learning environment. The algorithm started with 4 UAVs, then 1 more UAV joined the swarm, and after that, 2 UAVs left the swarm. The Meta-RL algorithm learns the optimal policy quickly and converges to its maximum expected reward.
Figure 4: Comparison of the algorithms in terms of energy consumption at strategic and non-strategic locations, convergence speed, and demand service satisfaction.
