Vehicle Detection from 3D Lidar Using Fully Convolutional Network
Bo Li, Tianlei Zhang, Tian Xia
TL;DR
This work addresses 3D vehicle detection from LiDAR range scans by projecting the data into a 2D point map and applying a single end-to-end fully convolutional network to predict per-point objectness and 3D bounding boxes. A novel 24D bounding-box encoding, derived from a per-point local coordinate system and rotation-invariant transforms, enables accurate 3D localization using 2D CNNs. The model is trained with balanced, multi-task losses and augmented data, and detections are refined via non-maximum suppression. On the KITTI dataset, the method achieves state-of-the-art or competitive performance in both offline world-space metrics and online evaluations, demonstrating the viability of FCN-based detection on lidar range data.
Abstract
Convolutional network techniques have recently achieved great success in vision-based detection tasks. This paper presents our recent work on transferring the fully convolutional network technique to detection tasks on 3D range scan data. Specifically, the scenario is vehicle detection from the range data of a Velodyne 64E lidar. We propose to represent the data as a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously. By carefully designing the bounding box encoding, the network is able to predict full 3D bounding boxes even though it is a 2D convolutional network. Experiments on the KITTI dataset show the state-of-the-art performance of the proposed method.
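
To make the idea of a 2D point map concrete, the following is a minimal sketch of one way to project a lidar point cloud onto a cylindrical grid indexed by elevation and azimuth. It is an illustration under stated assumptions, not the paper's implementation. The function name `project_to_point_map`, the grid size, the elevation limits, the row/column layout, the two per-cell channels (horizontal distance and height), and the rule of keeping the nearest point per cell are all hypothetical choices made for this example.

```python
import numpy as np


def project_to_point_map(points, n_rows=64, n_cols=512,
                         phi_min=np.radians(-25.0), phi_max=np.radians(3.0)):
    """Project (N, 3) sensor-frame points onto an (n_rows, n_cols, 2) map.

    All defaults are illustrative placeholders, not values from the paper.
    Channel 0 holds horizontal distance d = sqrt(x^2 + y^2), channel 1 holds z.
    Cells that receive no point stay at zero.
    """
    points = np.asarray(points, dtype=np.float64)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rng = np.sqrt(x**2 + y**2 + z**2)

    # Drop points at the origin, where the angles are undefined.
    keep = rng > 0
    x, y, z, rng = x[keep], y[keep], z[keep], rng[keep]

    theta = np.arctan2(y, x)   # azimuth in [-pi, pi]
    phi = np.arcsin(z / rng)   # elevation in [-pi/2, pi/2]

    # Discretize the angles. Azimuth wraps around; elevation is cropped.
    d_theta = 2.0 * np.pi / n_cols
    d_phi = (phi_max - phi_min) / n_rows
    col = np.floor((theta + np.pi) / d_theta).astype(int) % n_cols
    row = np.floor((phi - phi_min) / d_phi).astype(int)

    inside = (row >= 0) & (row < n_rows)
    x, y, z, rng = x[inside], y[inside], z[inside], rng[inside]
    row, col = row[inside], col[inside]

    # Assumed collision rule: keep the nearest point in each cell.
    # Sort by range, then take the first occurrence of every cell index.
    flat = row * n_cols + col
    order = np.argsort(rng, kind="stable")
    _, first = np.unique(flat[order], return_index=True)
    sel = order[first]

    point_map = np.zeros((n_rows, n_cols, 2), dtype=np.float64)
    point_map[row[sel], col[sel], 0] = np.hypot(x[sel], y[sel])
    point_map[row[sel], col[sel], 1] = z[sel]
    return point_map
```

A dense grid of this kind can be fed to an ordinary 2D convolutional network, which is what lets a 2D architecture operate on range data. A real pipeline would also need a rule for empty cells and a grid resolution matched to the sensor.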
