LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation

Zhefan Xu, Haoyu Shen, Xinming Han, Hanyu Jin, Kanlong Ye, Kenji Shimada

TL;DR

This work tackles real-time dynamic obstacle perception for indoor mobile robots with limited sensors and computation. It proposes LV-DOT, a LiDAR-visual fusion framework that combines lightweight detectors operating on LiDAR, depth, and color camera streams, followed by a Kalman-filter-based tracker and a module that identifies which obstacles are dynamic. On a self-collected indoor dataset, the method achieves higher F1 scores than the benchmark methods across IoU thresholds, and physical quadcopter experiments confirm that it runs in real time onboard. The open-source ROS implementation and ablation analyses show the contribution of each sensor stream. Overall, LV-DOT offers a practical, sensor-efficient solution for indoor autonomous navigation in cluttered environments.
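
As an illustration of the metric named above, the following is a minimal sketch of an F1 score computed at a single IoU threshold for axis-aligned 3D boxes. The box layout `(xmin, ymin, zmin, xmax, ymax, zmax)`, the function names, and the greedy one-to-one matching are assumptions made for this example; the paper's exact evaluation protocol is not described in this summary.

```python
from typing import List, Tuple

# Assumed box layout: (xmin, ymin, zmin, xmax, ymax, zmax)
Box = Tuple[float, float, float, float, float, float]


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def f1_at_iou(preds: List[Box], truths: List[Box], thr: float) -> float:
    """F1 score where a prediction counts as a true positive if it matches
    a not-yet-matched ground-truth box with IoU >= thr (greedy matching,
    an assumption of this sketch)."""
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou_3d(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(truths) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Sweeping `thr` over several values gives one F1 score per IoU threshold, which is the kind of curve the summary refers to.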

Abstract

Accurate perception of dynamic obstacles is essential for autonomous robot navigation in indoor environments. Although sophisticated 3D object detection and tracking methods have been investigated and developed thoroughly in the fields of computer vision and autonomous driving, their demands on expensive and high-accuracy sensor setups and substantial computational resources from large neural networks make them unsuitable for indoor robotics. Recently, more lightweight perception algorithms leveraging onboard cameras or LiDAR sensors have emerged as promising alternatives. However, relying on a single sensor poses significant limitations: cameras have limited fields of view and can suffer from high noise, whereas LiDAR sensors operate at lower frequencies and lack the richness of visual features. To address these limitations, we propose a dynamic obstacle detection and tracking framework that uses both onboard camera and LiDAR data to enable lightweight and accurate perception. Our proposed method expands on our previous ensemble detection approach, which integrates outputs from multiple low-accuracy but computationally efficient detectors to ensure real-time performance on the onboard computer. In this work, we propose a more robust fusion strategy that integrates both LiDAR and visual data to further enhance detection accuracy. We then utilize a tracking module that adopts feature-based object association and the Kalman filter to track detected obstacles and estimate their states. In addition, a dynamic obstacle classification algorithm is designed to robustly identify moving objects. The dataset evaluation demonstrates better perception performance than benchmark methods. The physical experiments on a quadcopter robot confirm the feasibility for real-world navigation.
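
To make the tracking step concrete, here is a minimal sketch of a constant-velocity Kalman filter for one position axis, followed by a speed-threshold check for labeling an obstacle as dynamic. The class name `AxisKalman`, the noise values `q` and `r`, the constant-velocity motion model, and the `is_dynamic` threshold rule are all assumptions for illustration; the abstract does not specify the paper's state model or its classification algorithm.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: position p, velocity v).

    The 2x2 covariance is stored element-wise to avoid external dependencies.
    """
    p: float = 0.0
    v: float = 0.0
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0

    def predict(self, dt: float, q: float = 0.01) -> None:
        # State transition F = [[1, dt], [0, 1]]; process noise Q = q * I (assumed).
        self.p += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z: float, r: float = 0.01) -> None:
        # Position-only measurement, H = [1, 0]; r is the measurement variance.
        s = self.p00 + r
        k0, k1 = self.p00 / s, self.p10 / s
        y = z - self.p
        self.p += k0 * y
        self.v += k1 * y
        # Covariance update P = (I - K H) P.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def is_dynamic(speed: float, threshold: float = 0.3) -> bool:
    """Hypothetical rule: label an obstacle dynamic if its estimated speed
    exceeds `threshold` (m/s). A stand-in for the paper's classifier."""
    return abs(speed) > threshold
```

A tracker in this style would run one such filter per axis for each associated obstacle, calling `predict` at every time step and `update` whenever a matched detection arrives; the estimated velocity then feeds the static-versus-dynamic decision.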

Paper Structure

This paper contains 13 sections, 4 equations, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: Visualization of dynamic obstacle perception using the proposed framework, with the point cloud of tracked dynamic obstacles highlighted.
  • Figure 2: The proposed LiDAR-visual dynamic obstacle detection and tracking framework. Given the LiDAR scan, robot odometry, and RGB-D image, the LiDAR-visual fusion module integrates unclassified obstacle bounding boxes from both the LiDAR and visual (depth and color) detection modules to produce more accurate obstacle detections. The Tracking module then estimates the states of these obstacles and classifies them as static or dynamic.
  • Figure 3: Illustration of the LiDAR obstacle detection process. (a) The scene captured from the front RGB camera. (b) The LiDAR scan point cloud with robot pose and the camera field of view visualized. (c) The downsampled point cloud. (d) The detected obstacle axis-aligned bounding boxes.
  • Figure 4: Illustration of the visual U-depth 3D obstacle detection. (a) The RGB image view. (b) The 2D bounding boxes detected in the depth image. (c) The detected 3D bounding boxes with the camera's field of view visualization. (d) The 2D bounding boxes detected from the U-depth map.
  • Figure 5: Examples of qualitative experimental results from five testing environments, where the robot, equipped with a LiDAR and a camera, is positioned at the center of each environment. The top row displays point cloud data from the robot’s sensors and highlights detected and tracked dynamic obstacles. The bottom row presents the corresponding RGB images from the robot’s camera, which provide a visual perspective of each environment.
  • ...and 2 more figures
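
The LiDAR detection stages listed in the Figure 3 caption (downsampled point cloud, then axis-aligned bounding boxes) can be sketched as follows. This is a minimal illustration under stated assumptions: voxel-grid centroid downsampling and connected-component clustering over occupied voxels are stand-ins chosen for the example, and the function names are hypothetical. The captions do not say which downsampling or clustering method the paper uses.

```python
import math
from collections import deque
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]
Box = Tuple[float, float, float, float, float, float]  # (xmin, ymin, zmin, xmax, ymax, zmax)


def voxel_downsample(points: List[Point], size: float) -> Dict[Voxel, Point]:
    """Keep one centroid per occupied voxel of edge length `size`."""
    cells: Dict[Voxel, List[Point]] = {}
    for p in points:
        key = (math.floor(p[0] / size), math.floor(p[1] / size), math.floor(p[2] / size))
        cells.setdefault(key, []).append(p)
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in cells.items()
    }


def cluster_voxels(occupied: Dict[Voxel, Point]) -> List[List[Point]]:
    """Group centroids whose voxels touch (26-neighborhood) via breadth-first search."""
    offsets = [
        (dx, dy, dz)
        for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
        if (dx, dy, dz) != (0, 0, 0)
    ]
    seen = set()
    clusters: List[List[Point]] = []
    for start in occupied:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members: List[Point] = []
        while queue:
            x, y, z = queue.popleft()
            members.append(occupied[(x, y, z)])
            for dx, dy, dz in offsets:
                nb = (x + dx, y + dy, z + dz)
                if nb in occupied and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(members)
    return clusters


def bounding_box(points: List[Point]) -> Box:
    """Axis-aligned bounding box enclosing a cluster of points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))


def detect_boxes(points: List[Point], voxel_size: float = 0.5) -> List[Box]:
    """Downsample, cluster, and box a raw point list."""
    occupied = voxel_downsample(points, voxel_size)
    return [bounding_box(c) for c in cluster_voxels(occupied)]
```

For example, calling `detect_boxes` on a scan containing two well-separated groups of points returns two boxes, one per group. In a full pipeline these boxes would be the unclassified LiDAR detections passed on to the fusion and tracking modules described in Figure 2.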