Navigation in a Three-Dimensional Urban Flow using Deep Reinforcement Learning

Federica Tonti, Ricardo Vinuesa

TL;DR

The algorithm presented here is a flow-aware Proximal Policy Optimization combined with a Gated Transformer eXtra Large (GTrXL) architecture, giving the agent richer information about the turbulent flow field in which it navigates, paving the way to a completely reimagined UAV landscape in complex urban environments.

Abstract

Unmanned Aerial Vehicles (UAVs) are increasingly populating urban areas for delivery and surveillance purposes. In this work, we develop an optimal navigation strategy based on Deep Reinforcement Learning. The environment is represented by a three-dimensional high-fidelity simulation of an urban flow, characterized by turbulence and recirculation zones. The algorithm presented here is a flow-aware Proximal Policy Optimization (PPO) combined with a Gated Transformer eXtra Large (GTrXL) architecture, giving the agent richer information about the turbulent flow field in which it navigates. The results are compared with a PPO+GTrXL without the secondary prediction tasks, a PPO combined with Long Short-Term Memory (LSTM) cells, and a traditional navigation algorithm. The results show a significant increase in the success rate (SR) and a lower crash rate (CR) compared to PPO+LSTM, PPO+GTrXL, and the classical Zermelo's navigation algorithm, paving the way to a completely reimagined UAV landscape in complex urban environments.
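The abstract reports results in terms of success rate (SR) and crash rate (CR). As a minimal sketch of how such metrics could be computed, the snippet below assumes each evaluation episode ends in exactly one labeled outcome. The labels `"success"`, `"crash"`, and `"timeout"`, and the helper `episode_rates`, are hypothetical and not taken from the paper, which may define the metrics differently.

```python
from collections import Counter


def episode_rates(outcomes):
    """Return (success_rate, crash_rate) as fractions of all episodes.

    `outcomes` is a list of per-episode labels. The label names are an
    assumption made for illustration, not the paper's own convention.
    """
    if not outcomes:
        raise ValueError("need at least one episode outcome")
    counts = Counter(outcomes)
    total = len(outcomes)
    # Episodes with any other label (e.g. "timeout") count toward neither rate.
    return counts["success"] / total, counts["crash"] / total


# Made-up outcomes for five evaluation episodes.
outcomes = ["success", "crash", "success", "timeout", "success"]
sr, cr = episode_rates(outcomes)
print(f"SR = {sr:.2f}, CR = {cr:.2f}")  # prints "SR = 0.60, CR = 0.20"
```

Under this assumption SR and CR need not sum to one, since an episode may end without either reaching the target or crashing.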

Paper Structure

This paper contains 14 sections, 14 equations, 4 figures, 1 table, 3 algorithms.

Figures (4)

  • Figure 1: Evolution of the reward, success rate, and crash rate vs. training iterations for PPO+LSTM (blue), PPO+GTrXL (orange), and Flow-aware PPO+GTrXL (green).
  • Figure 2: Visualization of trajectories produced by the trained policy of the flow-aware PPO+GTrXL algorithm.
  • Figure 3: Sketch of sensors' rays for elevation and azimuth. We also show a representation of the obstacle-detection method.
  • Figure 4: PPO+GTrXL, GTrXL block, and sketch of the modified algorithm, which includes temporal and spatial information of the flow field.