Reinforcement Learning for Collision-free Flight Exploiting Deep Collision Encoding
Mihir Kulkarni, Kostas Alexis
TL;DR
The paper tackles collision-free autonomous flight for aerial robots in cluttered, GPS-denied environments without relying on a global map. It introduces a modular pipeline in which a deep collision encoder compresses depth images into a $64$-dimensional latent representation; this latent, together with odometry and target information, feeds a deep reinforcement learning (DRL) policy trained with APPO for real-time navigation. Key contributions include task-driven depth compression via supervised training on both simulated and real data, a generalizable RL navigation policy, and sim-to-real validation in both simulation and real-world experiments with low-latency onboard inference. The approach demonstrates navigation in unseen clutter and under sensor noise, offering a scalable solution for autonomous flight in complex environments.
Abstract
This work contributes a novel deep navigation policy that enables collision-free flight of aerial robots based on a modular approach exploiting deep collision encoding and reinforcement learning. The proposed solution builds upon a deep collision encoder that is trained on both simulated and real depth images using supervised learning, such that it compresses the high-dimensional depth data to a low-dimensional latent space that encodes collision information while accounting for the robot's size. This compressed encoding is combined with an estimate of the robot's odometry and the desired target location to train a deep reinforcement learning navigation policy that offers low-latency computation and robust sim2real performance. A set of simulation and experimental studies in diverse environments is conducted and demonstrates the efficiency of the emergent behavior and its resilience in real-life deployments.
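The modular data flow described above (depth image to latent code, then latent code plus odometry and target into the policy input) can be illustrated with a minimal sketch. Only the 64-dimensional latent size comes from the text; the depth resolution, the odometry and target dimensions, and the function names `encode_depth` and `build_observation` are hypothetical, and the random linear projection is merely a stand-in for the trained collision encoder, not the authors' network.

```python
import numpy as np

LATENT_DIM = 64         # latent size stated in the TL;DR
DEPTH_SHAPE = (60, 80)  # hypothetical depth image resolution
ODOM_DIM = 9            # hypothetical odometry state size
TARGET_DIM = 3          # hypothetical target position (x, y, z)


def encode_depth(depth, weights):
    """Stand-in for the learned encoder: a fixed linear map squashed by tanh."""
    return np.tanh(weights @ depth.ravel())


def build_observation(latent, odometry, target):
    """Concatenate latent code, odometry, and target into one flat vector."""
    return np.concatenate([latent, odometry, target]).astype(np.float32)


rng = np.random.default_rng(0)
n_pixels = DEPTH_SHAPE[0] * DEPTH_SHAPE[1]
weights = rng.normal(scale=1e-2, size=(LATENT_DIM, n_pixels))

depth = rng.uniform(0.0, 10.0, size=DEPTH_SHAPE)  # synthetic depth in metres
odometry = np.zeros(ODOM_DIM)                     # placeholder state estimate
target = np.array([5.0, 0.0, 1.0])                # placeholder goal location

latent = encode_depth(depth, weights)
obs = build_observation(latent, odometry, target)
print(obs.shape)
```

Under these assumed sizes the policy input is a flat vector of length 64 + 9 + 3. Keeping the encoder separate from the policy means the high-dimensional depth image never reaches the policy directly, which is the point of the modular design.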
