Coordinated Autonomous Drones for Human-Centered Fire Evacuation in Partially Observable Urban Environments

Maria G. Mendoza, Addison Kalanther, Daniel Bostwick, Emma Stephan, Chinmay Maheshwari, Shankar Sastry

TL;DR

This paper addresses real-time, human-centered fire evacuation in partially observable urban environments by coordinating two heterogeneous UAVs (HLR and LLR) to locate and guide evacuees under panic. It combines an agent-based panic model with a POMDP formulation and trains a centralized, recurrent policy via PPO to handle long-horizon planning and partial observability. The reward structure emphasizes visibility, proximity, and successful capture, and experiments demonstrate significant reductions in time to safety and robust performance across varied initial conditions, while highlighting remaining limitations and avenues for scaling to multiple evacuees. The work offers a practical, autonomous framework that could augment emergency response in low-resource or high-risk settings, informing deployment strategies for urban disaster relief.

Abstract

Autonomous drone technology holds significant promise for enhancing search and rescue operations during evacuations by guiding humans toward safety and supporting broader emergency response efforts. However, its application in dynamic, real-time evacuation support remains limited. Existing models often overlook the psychological and emotional complexity of human behavior under extreme stress. In real-world fire scenarios, evacuees frequently deviate from designated safe routes due to panic and uncertainty. To address these challenges, this paper presents a multi-agent coordination framework in which autonomous Unmanned Aerial Vehicles (UAVs) assist human evacuees in real time by locating, intercepting, and guiding them to safety under uncertain conditions. We model the problem as a Partially Observable Markov Decision Process (POMDP), where two heterogeneous UAV agents, a high-level rescuer (HLR) and a low-level rescuer (LLR), coordinate through shared observations and complementary capabilities. Human behavior is captured using an agent-based model grounded in empirical psychology, where panic dynamically affects decision-making and movement in response to environmental stimuli. The environment features stochastic fire spread, unknown evacuee locations, and limited visibility, requiring UAVs to plan over long horizons to search for humans and adapt in real time. Our framework employs the Proximal Policy Optimization (PPO) algorithm with recurrent policies to enable robust decision-making in partially observable settings. Simulation results demonstrate that the UAV team can rapidly locate and intercept evacuees, significantly reducing the time required for them to reach safety compared to scenarios without UAV assistance.

Paper Structure

This paper contains 14 sections, 13 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Left: A 3D urban environment from a top-down perspective illustrating a disaster evacuation scenario involving a human evacuee, fire, and two UAV rescuers. Right: The same environment is modeled as a 2D grid-based world, where each cell is annotated with accessibility and visibility properties.
  • Figure 2: Number of steps taken by the evacuee to reach the safe zone under varying panic levels.
  • Figure 3: Time to capture vs. percent improvement in evacuation time.
  • Figure 4: Impact of $r$ on evacuation outcomes. The left axis shows the percentage of evacuees successfully intercepted by UAV rescuers. The right axis shows the average percent improvement in capture time relative to the panic-only baseline.
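
The percent improvement reported in Figures 3 and 4 can be read as a relative reduction against the baseline without UAV assistance. The following is a minimal sketch of that calculation, assuming times are measured in simulation steps and that improvement is defined as (baseline - assisted) / baseline. The function names and the exact definition are illustrative assumptions, not taken from the paper.

```python
def percent_improvement(baseline_steps: float, assisted_steps: float) -> float:
    """Relative reduction in time, as a percentage of the baseline.

    Assumed definition (not confirmed by the paper):
        100 * (baseline - assisted) / baseline
    A negative value means the assisted run was slower than the baseline.
    """
    if baseline_steps <= 0:
        raise ValueError("baseline_steps must be positive")
    return 100.0 * (baseline_steps - assisted_steps) / baseline_steps


def mean_percent_improvement(pairs):
    """Average the per-episode improvement over (baseline, assisted) pairs."""
    values = [percent_improvement(b, a) for b, a in pairs]
    if not values:
        raise ValueError("need at least one episode")
    return sum(values) / len(values)
```

For example, under this assumed definition an evacuee who needs 200 steps alone and 150 steps with UAV guidance shows a 25% improvement. Averaging per episode, rather than comparing mean times, weights every initial condition equally.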