Flying in Highly Dynamic Environments with End-to-end Learning Approach

Xiyu Fan, Minghao Lu, Bowen Xu, Peng Lu

TL;DR

The paper tackles autonomous obstacle avoidance for quadrotors in highly dynamic cluttered environments, where static planners struggle. It introduces an end-to-end framework that encodes lidar point clouds into a fixed-size 2D obstacle map and trains a neural policy to output horizontal acceleration commands using reinforcement learning. Key contributions include a novel lidar data encoding scheme, end-to-end training that handles both static and dynamic obstacles, and extensive simulations and real-world demonstrations showing reduced latency and robust high-speed obstacle avoidance. The approach offers portable on-board perception-to-action capabilities suitable for real-time operation in cluttered environments, with future work extending to full 3D maneuvers and stability improvements.

Abstract

Obstacle avoidance for unmanned aerial vehicles such as quadrotors is a popular research topic. Most existing research focuses only on static environments, and obstacle avoidance in environments with multiple dynamic obstacles remains challenging. This paper proposes a novel deep reinforcement learning-based approach that lets quadrotors navigate through highly dynamic environments. We propose a lidar data encoder that extracts obstacle information from the massive point cloud data produced by the lidar. Multiple frames of historical scans are compressed into a two-dimensional obstacle map that retains the required obstacle features. An end-to-end deep neural network is trained to extract the kinematics of dynamic and static obstacles from the obstacle map and to generate acceleration commands that steer the quadrotor around these obstacles. Our approach combines perception and navigation in a single neural network, which can change from a navigating state to a hovering state without mode switching. We also present simulations and real-world experiments that show the effectiveness of our approach when navigating in highly dynamic cluttered environments.
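
As a rough illustration of the lidar encoding idea described in the abstract, the sketch below collapses each scan into one range value per angular bin and stacks recent scans into a fixed-size 2D grid. It is not the authors' encoder. The 36 × 36 size follows the Figure 2 caption, while the angular binning rule, the `MAX_RANGE` normalisation, the assumption that points are already projected onto the horizontal plane, and all names (`scan_to_ranges`, `ObstacleMap`) are assumptions made for this example.

```python
import math
from collections import deque

N_BINS = 36       # angular bins per scan (assumed; chosen to match the 36 x 36 map)
N_FRAMES = 36     # number of historical scans kept (assumed)
MAX_RANGE = 10.0  # metres; hypothetical sensing range used for normalisation


def scan_to_ranges(points, n_bins=N_BINS, max_range=MAX_RANGE):
    """Collapse (x, y) lidar returns into the closest range per angular bin."""
    ranges = [max_range] * n_bins
    for x, y in points:
        dist = math.hypot(x, y)
        if dist == 0.0 or dist >= max_range:
            continue  # ignore returns at the sensor origin or beyond the range limit
        angle = math.atan2(y, x) % (2.0 * math.pi)
        idx = min(int(angle / (2.0 * math.pi) * n_bins), n_bins - 1)
        ranges[idx] = min(ranges[idx], dist)
    return ranges


class ObstacleMap:
    """Rolling stack of encoded scans: one row per frame, newest row last."""

    def __init__(self, n_bins=N_BINS, n_frames=N_FRAMES, max_range=MAX_RANGE):
        self.n_bins = n_bins
        self.max_range = max_range
        # Start with an all-free history so the grid always has a fixed size.
        self.rows = deque(
            ([0.0] * n_bins for _ in range(n_frames)), maxlen=n_frames
        )

    def push(self, points):
        """Encode one scan and append it, dropping the oldest row."""
        ranges = scan_to_ranges(points, self.n_bins, self.max_range)
        # Higher value means a closer obstacle, as in the Figure 3 caption.
        self.rows.append([1.0 - r / self.max_range for r in ranges])

    def as_grid(self):
        """Return the map as a list of N_FRAMES rows of N_BINS values."""
        return [list(row) for row in self.rows]
```

Because consecutive rows hold the same angular bins at different times, a moving obstacle shows up as a slanted streak across rows, which is one plausible way a network could recover obstacle motion from a single 2D input.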

Paper Structure

This paper contains 17 sections, 16 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Our quadrotor flies through a dynamic cluttered environment, which contains both static obstacles and randomly walking pedestrians.
  • Figure 2: The overall architecture of our system. Raw lidar data is processed into one-dimensional range data and stacked into a 36 × 36 obstacle map. The obstacle map is then fed into the encoder network, and the features extracted by the encoder are fed into the MLP together with the command and quadrotor state information. The output of the MLP is a two-axis acceleration command.
  • Figure 3: Sketch of obstacles and the corresponding obstacle map. (a) shows the kinematics of 2 moving objects, and (b) is the corresponding sketch of the obstacle map. Areas of the obstacle map with higher gray levels represent obstacles closer to the quadrotor. (c) shows the learning curves of the agent using the proposed encoded lidar data and using the raw lidar data.
  • Figure 4: Our training environment in Unity. Static and dynamic obstacles are randomly generated. (a) is the training environment in Unity. (b) shows the dynamic obstacle reward $r^i_d(t)$ in Eq. \ref{e:dyn_reward}. (c) demonstrates the dilation ratio $k^i(t)$ in Eq. \ref{e:dilation}. The dynamic obstacle is moving at a speed of $5\,\mathrm{m/s}$ in the calculation of the reward function in (b) and the dilation ratio in (c).
  • Figure 5: The quadrotor maneuvers through 5 distinct dynamic and cluttered environments in simulation. The scenarios labeled (a) to (e) correspond to scenarios 1 to 5 outlined in Table \ref{t:planning comp1}, respectively.
  • ...and 2 more figures
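
The data flow in the Figure 2 caption (obstacle map → encoder → features joined with command and state → MLP → two-axis acceleration) can be sketched as a forward pass. This is only an illustration under assumptions: the caption does not state the encoder type, layer sizes, state layout, or output scaling, so a single dense layer stands in for the encoder, the weights are random rather than trained, and `FEATURE_DIM`, `STATE_DIM`, `HIDDEN_DIM`, `MAX_ACC`, and the `Policy` class are hypothetical.

```python
import math
import random

MAP_SIZE = 36     # obstacle map is 36 x 36 per the Figure 2 caption
FEATURE_DIM = 16  # hypothetical encoder output size
STATE_DIM = 4     # hypothetical: 2D velocity plus 2D commanded direction
HIDDEN_DIM = 32   # hypothetical MLP width
MAX_ACC = 3.0     # hypothetical acceleration limit in m/s^2


def make_dense(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x):
    """Apply one fully connected layer followed by tanh."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


class Policy:
    """Untrained stand-in that mirrors only the shape of the pipeline."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Dense stand-in for the encoder network (its real type is not given).
        self.encoder = make_dense(MAP_SIZE * MAP_SIZE, FEATURE_DIM, rng)
        self.hidden = make_dense(FEATURE_DIM + STATE_DIM, HIDDEN_DIM, rng)
        self.head = make_dense(HIDDEN_DIM, 2, rng)

    def act(self, obstacle_map, state):
        """Map a 36 x 36 grid and a state vector to (a_x, a_y)."""
        flat = [v for row in obstacle_map for v in row]
        features = dense(self.encoder, flat)
        # Encoder features are concatenated with command and state inputs.
        hidden = dense(self.hidden, features + list(state))
        ax, ay = dense(self.head, hidden)  # tanh keeps each output in [-1, 1]
        return MAX_ACC * ax, MAX_ACC * ay
```

Calling `Policy().act(grid, [vx, vy, cmd_x, cmd_y])` with a 36 × 36 list of lists returns a bounded pair of horizontal accelerations. In the paper the weights would come from reinforcement learning in the Unity environment rather than from random initialisation.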