Reinforcement Learning Based Escape Route Generation in Low Visibility Environments

Hari Srikanth

TL;DR

The paper tackles rapid, safe evacuation from low-visibility structure fires through real-time mapping and route planning. A fleet of drones carrying LiDAR and sonar collects measurements, a trust range filters the LiDAR data, and RANSAC-style alignment unifies the individual maps into one merged map. That map is converted into a visibility graph whose nodes carry danger labels, forming an environment tensor for planning. A Linear Function Approximation-based Natural Policy Gradient (LFA-NPG) RL method is shown to be robust to adversarial noise and to converge faster than deep-policy methods in low-dimensional settings. Two RL agents, savior and refugee, operate on the same state space to generate rescue and escape routes, giving firefighters and civilians real-time decision support in burning buildings.

Abstract

Structure fires are responsible for the majority of fire-related deaths nationwide. To assist with the rapid evacuation of trapped people, this paper proposes a system that determines optimal search paths for firefighters and exit paths for civilians in real time, based on environmental measurements. It tests a proposed solution to low-visibility mapping: a LiDAR mapping system whose output is evaluated and verified against a trust range derived from sonar and smoke concentration data. The resulting independent point clouds are used to create distinct maps, which are merged with a RANSAC-based alignment methodology and simplified into a visibility graph. Temperature and humidity data are then used to label each node with a danger score, creating an environment tensor. After demonstrating that a Linear Function Approximation-based Natural Policy Gradient RL methodology outperforms more complex competitors in robustness and speed, this paper outlines two systems (savior and refugee) that process the environment tensor to create safe rescue and escape routes, respectively.
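
As a rough illustration of what a linear-function-approximation policy over a danger-labeled graph might look like, the sketch below builds a log-linear (softmax) policy and its score function, the gradient of the log-probability that policy gradient methods use. Everything specific here is an assumption made for illustration and is not taken from the paper: the four-node graph, the danger scores, the two-element feature map, and the names `GRAPH`, `features`, `policy`, and `score` are all hypothetical. A natural policy gradient step would additionally precondition this gradient with the Fisher information matrix, which is not shown.

```python
import math

# Hypothetical toy environment: node -> (danger score in [0, 1], neighbours).
# The layout and scores are invented for illustration only.
GRAPH = {
    "room": (0.2, ["hall", "kitchen"]),
    "hall": (0.1, ["room", "exit"]),
    "kitchen": (0.9, ["room", "exit"]),
    "exit": (0.0, []),  # terminal node: no actions, so policy() is not defined here
}


def features(node, nxt):
    """Assumed feature map phi(s, a): danger of the next node, and an exit flag."""
    danger, _ = GRAPH[nxt]
    return [danger, 1.0 if nxt == "exit" else 0.0]


def policy(theta, node):
    """Log-linear policy: pi(a | s) is proportional to exp(theta . phi(s, a))."""
    actions = GRAPH[node][1]
    logits = [sum(t * f for t, f in zip(theta, features(node, a))) for a in actions]
    top = max(logits)  # subtract the max logit for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return {a: w / total for a, w in zip(actions, weights)}


def score(theta, node, action):
    """Gradient of log pi(action | node) w.r.t. theta: phi(s, a) - E_pi[phi(s, .)]."""
    probs = policy(theta, node)
    chosen = features(node, action)
    mean = [
        sum(p * features(node, a)[i] for a, p in probs.items())
        for i in range(len(theta))
    ]
    return [c - m for c, m in zip(chosen, mean)]
```

With a negative weight on danger and a positive weight on the exit flag, for example `policy([-2.0, 1.0], "room")`, the policy assigns more probability to the low-danger hall than to the high-danger kitchen. The parameter vector has only as many entries as there are features, which is the low-dimensional setting the abstract refers to.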

Paper Structure (6 sections, 3 figures)

Figures (3)

  • Figure 1: System Organization
  • Figure 2: LFA-NPG Robustness Analysis
  • Figure 3: LFA-NPG Model Convergence w.r.t. Iterations & Time