Onboard dynamic-object detection and tracking for autonomous robot navigation with RGB-D camera

Zhefan Xu, Xiaoyang Zhan, Yumeng Xiu, Christopher Suzuki, Kenji Shimada

TL;DR

The paper tackles real-time 3D dynamic obstacle perception for small robots with limited onboard computing, using an RGB-D camera instead of the heavy LiDAR sensors and GPU-intensive processing that existing methods rely on. It introduces a lightweight Dynamic Obstacle Detection and Tracking (DODT) framework that ensembles two non-learning detectors (U-depth and DBSCAN) with an optional lightweight learning-based detector (YOLO-MAD), which extends the detection range when computational resources permit. Feature-based data association with a constant-acceleration Kalman filter handles tracking, and a dynamic/static identification module separates dynamic obstacles from static ones. Among the benchmarked algorithms running on the robot's onboard computer, the approach achieves the lowest position error ($0.11$ m) and a comparable velocity error ($0.23$ m/s). It supports autonomous navigation in dynamic indoor environments, with real-time operation demonstrated on a quadcopter platform. Sensor fusion and improved occlusion handling are suggested as future directions to further enhance robustness.

Abstract

Deploying autonomous robots in crowded indoor environments usually requires them to have accurate dynamic obstacle perception. Although many previous works in the autonomous driving field have investigated the 3D object detection problem, their reliance on dense point clouds from a heavy Light Detection and Ranging (LiDAR) sensor and the high computation cost of learning-based data processing make those methods inapplicable to small robots, such as vision-based UAVs with small onboard computers. To address this issue, we propose a lightweight 3D dynamic obstacle detection and tracking (DODT) method based on an RGB-D camera, which is designed for low-power robots with limited computing power. Our method adopts a novel ensemble detection strategy, combining multiple computationally efficient but low-accuracy detectors to achieve real-time high-accuracy obstacle detection. We also introduce a new feature-based data association and tracking method that uses the statistical features of point clouds to prevent mismatches. In addition, our system includes an optional auxiliary learning-based module to enhance the obstacle detection range and dynamic obstacle identification. The proposed method is implemented on a small quadcopter, and the results show that our method achieves the lowest position error (0.11 m) and a comparable velocity error (0.23 m/s) among the benchmarked algorithms running on the robot's onboard computer. The flight experiments show that the tracking results from the proposed method allow the robot to alter its trajectory efficiently when navigating dynamic environments. Our software is available on GitHub as an open-source ROS package.

Paper Structure

This paper contains 13 sections, 10 equations, 10 figures, 2 tables, 1 algorithm.

Figures (10)

  • Figure 1: The onboard dynamic obstacle detection results from the proposed DODT algorithm. (a) The camera RGB view. (b) An example of an autonomous robot with an RGB-D camera. (c) The onboard 3D dynamic obstacle detection results shown as blue bounding boxes with point clouds.
  • Figure 2: The proposed dynamic obstacle detection and tracking system (DODT) framework. The input data are the RGB-D images. The non-learning detection module first uses the depth image to detect generic obstacles. Then, the tracking module tracks the obstacles and estimates their states. The identification module then identifies the dynamic obstacles among all detected obstacles. Finally, the output results show the dynamic obstacles' bounding boxes, and the dynamic obstacle regions are cleared from the static occupancy map. The optional learning-based detection module, shown with the blue dotted line, uses color and depth images to detect dynamic obstacles, enhancing the detection range and dynamic obstacle identification.
  • Figure 3: Illustration of the U-depth detector. (a) The camera RGB view. (b) The detected 3D bounding box with the obstacle point cloud. (c) The 2D detection on the depth map. (d) The 2D detection on the U-depth map.
  • Figure 4: Illustration of the DBSCAN detector. (a) The robot encounters obstacles in a corridor. (b) The raw point cloud data from the RGB-D camera are unstructured and noisy. (c) The DBSCAN detector takes the filtered point cloud and performs clustering to get obstacles' bounding boxes.
  • Figure 5: Illustration of the YOLO-MAD detector. The RGB image is used to get the 2D detection result, and then the bounding box on the depth image is obtained. With the 2D result on the depth image, the 3D bounding box is calculated by the proposed median absolute deviation (MAD) method.
  • ...and 5 more figures
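
The Figure 5 caption says that a median absolute deviation (MAD) method turns a 2D detection on the depth image into a 3D bounding box. The paper's exact formulation is not reproduced here. The following is only a minimal sketch of the general idea, assuming the depth pixels inside the 2D box are filtered by a MAD-based outlier test and the surviving values give the object's depth extent. The function name `mad_depth_extent`, the threshold `k`, and the use of the 1.4826 Gaussian scaling constant are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def mad_depth_extent(depth_roi, k=3.0):
    """Estimate the (near, far) depth of an object from the depth pixels
    inside its 2D bounding box. Hypothetical helper for illustration only.

    depth_roi: array-like of depth values in metres (0 or non-finite = invalid).
    k: assumed outlier threshold, in scaled-MAD units.
    Returns None if the region contains no valid depth reading.
    """
    d = np.asarray(depth_roi, dtype=float).ravel()
    d = d[np.isfinite(d) & (d > 0)]  # drop invalid readings
    if d.size == 0:
        return None

    med = np.median(d)
    mad = np.median(np.abs(d - med))
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # for Gaussian-distributed data (an assumption of this sketch).
    sigma = 1.4826 * mad

    if sigma == 0:
        inliers = d[d == med]  # more than half the values are identical
    else:
        inliers = d[np.abs(d - med) <= k * sigma]

    return float(inliers.min()), float(inliers.max())


# Example: a person at roughly 2 m, with background pixels at 6 m and one
# invalid (zero) reading inside the same 2D box.
roi = [1.9, 2.0, 2.0, 2.1, 2.0, 1.95, 2.05, 6.0, 6.0, 0.0]
near, far = mad_depth_extent(roi)
```

In this example the background readings at 6 m and the invalid zero are rejected, so the returned extent covers only the foreground object. The median and MAD are used rather than the mean and standard deviation because a 2D box around an object usually contains some background pixels, and those would otherwise stretch the estimated depth range.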