Deep RL-based Autonomous Navigation of Micro Aerial Vehicles (MAVs) in a complex GPS-denied Indoor Environment
Amit Kumar Singh, Prasanth Kumar Duba, P. Rajalakshmi
TL;DR
This work tackles GPS-denied indoor MAV navigation with a Deep-Proximal Policy Optimization (D-PPO) framework that operates on depth maps estimated from monocular RGB images. The method trains a CNN-based policy end-to-end in Unreal Engine-based simulations (AirSim) and validates it on real hardware, including a DJI Tello, achieving a substantial reduction in training time without sacrificing performance. The main contributions are the D-PPO-based learning pipeline, monocular-depth-to-action control on a 7x7 grid, and real-world validation on the TiHAN testbed, with notable gains in mean safe flight and real-time navigation. The results demonstrate practical viability for autonomous MAV operation in cluttered indoor environments and offer a path toward denser scenarios.
Abstract
The autonomy of Unmanned Aerial Vehicles (UAVs) in indoor environments poses significant challenges due to the lack of reliable GPS signals in enclosed spaces such as warehouses, factories, and indoor facilities. Micro Aerial Vehicles (MAVs) are preferred for navigating these complex, GPS-denied scenarios because of their agility and low power consumption, despite their limited computational capabilities. In this paper, we propose a Reinforcement Learning-based Deep-Proximal Policy Optimization (D-PPO) algorithm that enhances real-time navigation by improving computational efficiency. The end-to-end network is trained in realistic 3D meta-environments created using the Unreal Engine. With these trained meta-weights, the MAV system underwent extensive experimental trials in real-world indoor environments. The results indicate that the proposed method reduces computational latency by 91% during the training period without significant degradation in performance. The algorithm was tested on a DJI Tello drone, yielding similar results.
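
For readers unfamiliar with Proximal Policy Optimization, the following is a minimal sketch of the standard clipped surrogate objective that PPO-family methods build on. It is not the authors' D-PPO implementation: the function name `ppo_clipped_objective`, its arguments, and the clip range of 0.2 are illustrative assumptions, and the abstract does not state which hyperparameters or modifications D-PPO uses.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    All names and the default clip_eps are illustrative assumptions,
    not values taken from the paper.
    """
    n = len(advantages)
    if n == 0 or len(new_log_probs) != n or len(old_log_probs) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / n
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the property that makes PPO comparatively stable for image-based control policies.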
