LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation
Zhefan Xu, Haoyu Shen, Xinming Han, Hanyu Jin, Kanlong Ye, Kenji Shimada
TL;DR
This work tackles real-time dynamic obstacle perception for indoor mobile robots with limited sensors and computation. It proposes LV-DOT, a LiDAR-visual fusion framework that combines lightweight detectors operating on LiDAR, depth, and color camera streams, followed by a Kalman-filter-based tracker and a dynamic obstacle identification step. On a self-collected indoor dataset, the method achieves higher F1 scores than benchmark methods across IoU thresholds, and physical quadcopter experiments confirm that it runs in real time, which matters for safe navigation. The implementation is open-source and built on ROS, and ablation analyses show how much each sensor stream contributes, supporting robust performance on lightweight hardware. Overall, LV-DOT offers a practical, sensor-efficient solution for indoor autonomous navigation in cluttered environments.
Abstract
Accurate perception of dynamic obstacles is essential for autonomous robot navigation in indoor environments. Although sophisticated 3D object detection and tracking methods have been studied extensively in computer vision and autonomous driving, their reliance on expensive, high-accuracy sensor setups and on the substantial computational resources required by large neural networks makes them unsuitable for indoor robotics. Recently, more lightweight perception algorithms leveraging onboard cameras or LiDAR sensors have emerged as promising alternatives. However, relying on a single sensor poses significant limitations: cameras have limited fields of view and can suffer from high noise, whereas LiDAR sensors operate at lower frequencies and lack the richness of visual features. To address these limitations, we propose a dynamic obstacle detection and tracking framework that uses both onboard camera and LiDAR data to enable lightweight and accurate perception. Our proposed method expands on our previous ensemble detection approach, which integrates outputs from multiple low-accuracy but computationally efficient detectors to ensure real-time performance on the onboard computer. In this work, we propose a more robust fusion strategy that integrates both LiDAR and visual data to further enhance detection accuracy. We then use a tracking module that adopts feature-based object association and the Kalman filter to track detected obstacles and estimate their states. In addition, a dynamic obstacle classification algorithm is designed to robustly identify moving objects. The dataset evaluation demonstrates better perception performance than benchmark methods. The physical experiments on a quadcopter robot confirm the feasibility of the approach for real-world navigation.
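To make the tracking and dynamic-identification steps more concrete, the following is a minimal sketch of a constant-velocity Kalman filter on a single position axis, followed by a speed threshold that labels an obstacle as dynamic. It is an illustration under stated assumptions, not the LV-DOT implementation: the state layout, the noise values (`q_pos`, `q_vel`, `r`), the helper names (`AxisKalman`, `estimate_velocity`, `is_dynamic`), and the 0.2 m/s threshold are all hypothetical, and the abstract does not specify how the actual classifier decides that an object is moving.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one axis (hypothetical sketch).

    State is [position p, velocity v]; the covariance is stored as four scalars.
    All noise parameters below are assumed values chosen for illustration.
    """
    p: float = 0.0
    v: float = 0.0
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    q_pos: float = 1e-3   # assumed process noise on position
    q_vel: float = 1e-2   # assumed process noise on velocity
    r: float = 1e-2       # assumed measurement noise on position

    def predict(self, dt: float) -> None:
        # x <- F x with F = [[1, dt], [0, 1]]
        self.p += self.v * dt
        # P <- F P F^T + Q, expanded element by element
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q_pos
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q_vel
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z: float) -> None:
        # Only position is measured: H = [1, 0]
        y = z - self.p                 # innovation
        s = self.p00 + self.r          # innovation covariance
        k0 = self.p00 / s              # Kalman gain for position
        k1 = self.p10 / s              # Kalman gain for velocity
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


def estimate_velocity(measurements, dt):
    """Run the filter over a sequence of position measurements; return final velocity."""
    kf = AxisKalman()
    for z in measurements:
        kf.predict(dt)
        kf.update(z)
    return kf.v


def is_dynamic(velocity, threshold=0.2):
    """Label an obstacle as dynamic if its estimated speed exceeds an assumed threshold (m/s)."""
    return abs(velocity) > threshold


if __name__ == "__main__":
    dt = 0.1
    walking = [1.0 * dt * (k + 1) for k in range(50)]  # object moving at 1 m/s
    static = [0.0] * 50                                # object that never moves
    print(is_dynamic(estimate_velocity(walking, dt)))
    print(is_dynamic(estimate_velocity(static, dt)))
```

In a full 3D tracker, one such filter (or a single joint filter) would run per associated obstacle, with the association step deciding which detection feeds which filter. A plain speed threshold is the simplest possible classifier and is sensitive to detection noise, which is one reason a more robust dynamic identification step is useful.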
