RELAX: Reinforcement Learning Enabled 2D-LiDAR Autonomous System for Parsimonious UAVs

Guanlin Wu, Zhuokai Zhao, Yutao He

TL;DR

The paper addresses the cost barrier in autonomous UAV navigation by proposing RELAX, a framework that uses only a single 2D-LiDAR for mapping, offline planning, and online re-planning. It couples Hector-SLAM–based occupancy mapping with an offline RRT planner and a learning-based online re-planner (D3QN) that handles dynamic obstacles. Key contributions include the first end-to-end UAV navigation pipeline built on 2D-LiDAR, a modular architecture that enables real-time training in Gazebo-ROS-PX4, and performance competitive with higher-cost sensor setups. The work demonstrates that cost-effective sensors can still deliver robust, practically applicable autonomous navigation, opening a path toward broader adoption of parsimonious UAVs.
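D3QN is commonly read as a dueling double deep Q-network. As a rough illustration of the two ideas that name combines, the sketch below computes a dueling aggregation of a state value and per-action advantages, then a double-DQN bootstrap target in which the online estimate selects the next action and the target estimate scores it. All numbers are hypothetical; the three-entry vectors merely echo the three actions mentioned in the Figure 4 caption. The network architecture, reward, and hyperparameters used by RELAX are not given in this summary, so none of this should be read as the authors' implementation.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float, done: bool,
                      q_online_next: List[float],
                      q_target_next: List[float]) -> float:
    """Double-DQN target: the online estimate picks the action,
    the target estimate evaluates it."""
    if done:
        return reward  # no bootstrapping past a terminal state
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best]


# Hypothetical value/advantage outputs for the next state (3 actions).
q_online = dueling_q(1.0, [0.5, -0.5, 0.0])
q_target = dueling_q(0.8, [0.1, 0.4, -0.5])

# Hypothetical reward and discount factor.
target = double_dqn_target(reward=-0.1, gamma=0.9, done=False,
                           q_online_next=q_online, q_target_next=q_target)
print(q_online, q_target, target)
```

Note that the online estimate prefers action 0 while the target estimate alone would prefer action 1; decoupling selection from evaluation in this way is what double DQN uses to reduce over-estimation.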

Abstract

Unmanned Aerial Vehicles (UAVs) have become increasingly prominent in recent years, finding applications in surveillance, package delivery, and many other areas. Despite considerable efforts to develop algorithms that enable UAVs to navigate complex unknown environments autonomously, such algorithms often require expensive hardware and sensors, such as RGB-D cameras and 3D-LiDAR, leading to a persistent trade-off between performance and cost. To this end, we propose RELAX, a novel end-to-end autonomous framework that is exceptionally cost-efficient, requiring only a single 2D-LiDAR to enable UAVs to operate in unknown environments. Specifically, RELAX comprises three components: a pre-processing map constructor; an offline mission planner; and a reinforcement learning (RL)-based online re-planner. Experiments demonstrate that RELAX offers more robust dynamic navigation than existing algorithms at a fraction of their cost. The code will be made public upon acceptance.
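To make the offline mission planner more concrete, here is a minimal sketch of a basic RRT (rapidly-exploring random tree) search over a binary occupancy grid, the planner family named in the TL;DR. The grid layout, step size, goal bias, tolerance, and the helper names `is_free`, `segment_free`, and `rrt` are all assumptions chosen for illustration; the summary does not specify the RRT variant or parameters used in RELAX.

```python
import math
import random


def is_free(grid, x, y):
    """True if point (x, y) lies inside the grid and in a free (0) cell."""
    row, col = math.floor(y), math.floor(x)
    return 0 <= row < len(grid) and 0 <= col < len(grid[0]) and grid[row][col] == 0


def segment_free(grid, p, q, step=0.1):
    """Approximate collision check: sample points along the segment p-q."""
    n = max(1, math.ceil(math.hypot(q[0] - p[0], q[1] - p[1]) / step))
    return all(
        is_free(grid, p[0] + (q[0] - p[0]) * i / n, p[1] + (q[1] - p[1]) * i / n)
        for i in range(n + 1)
    )


def rrt(grid, start, goal, step=1.0, goal_tol=1.0, goal_bias=0.1,
        max_iters=5000, seed=0):
    """Grow a tree from start; return a list of waypoints or None."""
    rng = random.Random(seed)
    height, width = len(grid), len(grid[0])
    nodes, parent = [start], [None]
    for _ in range(max_iters):
        # Occasionally sample the goal itself to pull the tree toward it.
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(0, width), rng.uniform(0, height))
        # Nearest existing node (linear scan is fine for a small sketch).
        i_near = min(range(len(nodes)),
                     key=lambda i: math.hypot(nodes[i][0] - sample[0],
                                              nodes[i][1] - sample[1]))
        near = nodes[i_near]
        dist = math.hypot(sample[0] - near[0], sample[1] - near[1])
        if dist == 0.0:
            continue
        # Steer at most `step` from the nearest node toward the sample.
        t = min(1.0, step / dist)
        new = (near[0] + t * (sample[0] - near[0]),
               near[1] + t * (sample[1] - near[1]))
        if not segment_free(grid, near, new):
            continue
        nodes.append(new)
        parent.append(i_near)
        close = math.hypot(goal[0] - new[0], goal[1] - new[1]) <= goal_tol
        if close and segment_free(grid, new, goal):
            # Walk parent links back to the root, then reverse.
            path = [] if new == goal else [goal]
            i = len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


# Hypothetical 10x10 map: a wall in column 5 with a gap in rows 7-9.
grid = [[0] * 10 for _ in range(10)]
for row in range(7):
    grid[row][5] = 1

path = rrt(grid, start=(1.5, 1.5), goal=(8.5, 1.5))
print(path)
```

Plain RRT returns a feasible but generally non-optimal path; smoothing or an RRT* variant would be a natural refinement, though the summary does not say whether RELAX applies either.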

Paper Structure (8 sections, 9 equations, 6 figures, 2 tables, 1 algorithm)

Figures (6)

  • Figure 1: System overview: RELAX starts by checking whether the occupancy grid map exists. If there is no map, it runs the map constructor to enter map-construction mode. While we manually fly the drone through one complete circuit of the environment at a specific altitude, the map constructor processes the 2D-LiDAR data and integrates it into an occupancy grid map. This map is then sent back to resources and made available to other modules. Next, the mission planner subscribes to this map, uses it to plan an obstacle-free path from start to target, and sends the path to the online re-planner, which performs dynamic obstacle avoidance using real-time 2D-LiDAR inputs.
  • Figure 2: Left: environment of UAV at a particular position; Right: "raw" 2D-LiDAR scanning image of the left environment.
  • Figure 3: Left: occupancy map constructed by Hector-SLAM. Right: a path planning result based on the given occupancy map.
  • Figure 4: Left: state-action correspondence, where the agent can choose or exclude actions [$1, -1, 0$] based on the distance. Right: training environment of the model.
  • Figure 5: Left: training environment used for fine-tuning after training in the environment shown in Fig. 4 (right). Right: a typical farmland environment, where delineated regions are labeled $a$, $b$, $c$, etc., and are separated by movable iron bars and wires (shown as red dotted lines). The objective of the case study is to navigate a parsimonious UAV from the house to the tower for nocturnal inspections.
  • ...and 1 more figure
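The "raw" 2D-LiDAR scan in Figure 2 is a set of range readings at known bearings. As a simplified illustration of how such a scan relates to an occupancy grid like the one in Figure 3, the sketch below converts beam endpoints into grid cell indices given a known sensor pose. This is not Hector-SLAM, which also estimates the pose by scan matching; here the pose is simply assumed to be known. The parameter names mirror the fields of the ROS `sensor_msgs/LaserScan` message, while the function name `scan_to_cells`, the resolution, and the sample readings are hypothetical.

```python
import math


def scan_to_cells(ranges, angle_min, angle_increment, pose, resolution, range_max):
    """Return the set of (row, col) grid cells hit by valid beam endpoints.

    pose is (x, y, yaw) of the sensor in the map frame, in metres/radians;
    resolution is the cell edge length in metres.
    """
    x0, y0, yaw = pose
    cells = set()
    for i, r in enumerate(ranges):
        # Skip out-of-range readings; this comparison also rejects inf and NaN.
        if not (0.0 < r < range_max):
            continue
        angle = yaw + angle_min + i * angle_increment
        x = x0 + r * math.cos(angle)
        y = y0 + r * math.sin(angle)
        cells.add((math.floor(y / resolution), math.floor(x / resolution)))
    return cells


# Hypothetical 4-beam scan (0, 90, 180, 270 degrees); the second beam saw nothing.
occupied = scan_to_cells(
    ranges=[2.0, math.inf, 1.0, 3.0],
    angle_min=0.0,
    angle_increment=math.pi / 2,
    pose=(5.25, 5.25, 0.0),
    resolution=0.5,
    range_max=10.0,
)
print(sorted(occupied))
```

A fuller mapper would also mark the cells along each beam as free and accumulate evidence over many scans; this sketch only marks endpoints from a single scan.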