NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud

Shuai Liu, Di Wang, Quan Wang, Kai Huang

TL;DR

The paper tackles the persistent misalignment between localization quality and classification confidence in LiDAR-based 3D object detection by introducing a post-processing Neighbor IoU-Voting (NIV) strategy that rectifies confidence using neighbor-derived statistics, without altering network architecture. It pairs NIV with an object resampling augmentation to address the imbalance between easy and difficult objects, producing an efficient single-stage detector called NIV-SSD. Through extensive experiments on KITTI, ONCE, and Waymo, NIV-SSD demonstrates improved confidence calibration, competitive accuracy, and favorable speed-accuracy trade-offs, validating the generality of NIV across datasets. The approach offers practical impact by providing a plug-in rectification method and a simple augmentation to boost performance in real-time autonomous driving systems.

Abstract

Previous single-stage detectors typically suffer from misalignment between localization accuracy and classification confidence. To solve this misalignment problem, we introduce a novel rectification method named the neighbor IoU-voting (NIV) strategy. Typically, classification and regression are treated as separate branches, making it challenging to establish a connection between them. Consequently, the classification confidence cannot accurately reflect the regression quality. The NIV strategy serves as a bridge between the classification and regression branches by calculating two types of statistical data from the regression output to correct the classification confidence. Furthermore, to alleviate the imbalance in detection accuracy between complete objects with dense points (easy objects) and incomplete objects with sparse points (difficult objects), we propose a new data augmentation scheme named object resampling. It undersamples easy objects and oversamples difficult objects by randomly transforming a portion of the easy objects into difficult objects. Finally, combining the NIV strategy and object resampling augmentation, we design an efficient single-stage detector termed NIV-SSD. Extensive experiments on several datasets indicate the effectiveness of the NIV strategy and the competitive performance of the NIV-SSD detector. The code will be available at https://github.com/Say2L/NIV-SSD.
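The object resampling idea in the abstract (turning some easy, dense objects into difficult, sparse or incomplete ones) can be illustrated with a small sketch. This is not the authors' implementation: the function names `make_difficult` and `resample_objects`, the point-count threshold `dense_thresh`, the probability `prob`, the `keep_ratio`, and the half-plane crop used to mimic incompleteness are all assumptions chosen for illustration.

```python
import numpy as np


def make_difficult(points, rng, keep_ratio=0.3):
    """Turn a dense object into a sparse, partially visible one.

    points: (N, 3) array of x, y, z coordinates for one object.
    The half-plane crop and keep_ratio are hypothetical choices.
    """
    center = points.mean(axis=0)
    # Pick a random horizontal direction and keep only the points on one
    # side of the object center, which mimics an incomplete object.
    angle = rng.uniform(0.0, 2.0 * np.pi)
    normal = np.array([np.cos(angle), np.sin(angle)])
    side = (points[:, :2] - center[:2]) @ normal
    visible = points[side >= 0]
    if len(visible) == 0:  # defensive guard; keep the object non-empty
        visible = points
    # Randomly subsample the remaining points to mimic sparsity.
    n_keep = max(1, int(len(visible) * keep_ratio))
    idx = rng.choice(len(visible), size=n_keep, replace=False)
    return visible[idx]


def resample_objects(objects, rng, dense_thresh=100, prob=0.5):
    """Randomly convert a portion of the easy objects into difficult ones.

    An object counts as "easy" here if it has at least dense_thresh
    points; this criterion is an assumption, not taken from the paper.
    """
    out = []
    for pts in objects:
        is_easy = len(pts) >= dense_thresh
        if is_easy and rng.random() < prob:
            out.append(make_difficult(pts, rng))
        else:
            out.append(pts)
    return out
```

Calling `resample_objects(objects, np.random.default_rng(0))` on a list of per-object point arrays returns a list of the same length in which some dense objects have been thinned out, so the training set contains fewer easy and more difficult instances.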

Paper Structure (21 sections, 13 figures, 9 tables, 1 algorithm)

Figures (13)

  • Figure 1: Comparisons on speed and accuracy. Results are obtained on 3D car detection in the KITTI test set.
  • Figure 2: Scatterplots of (a) real IoU vs. NIV score (w/ mIoU), where the NIV score is the mean IoU between a predicted box and its neighbors; and (b) real IoU vs. NIV score (w/ all), where the NIV score combines the mean IoU and the number of neighbors. "PCC" denotes the Pearson correlation coefficient.
  • Figure 3: The detection pipeline of our NIV-SSD. First, a point cloud is transformed into voxels. Next, the voxels are fed to a 3D backbone which is composed of 3D sparse convolutions. A 2D feature map is generated by the 3D backbone. Then, a 2D backbone is used to extract features from the 2D feature map, and a multi-task head module is utilized to produce multi-task predictions. Finally, the neighbor IoU-voting (NIV) strategy is adopted to rectify classification confidences, and NMS is used to filter redundant predictions.
  • Figure 4: A diagram of replacing a traditional convolution layer with a ConvNeXt block.
  • Figure 5: A simple example of the NIV calculation.
  • ...and 8 more figures
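
The two statistics named in the Figure 2 caption (the mean IoU between a predicted box and its neighbors, and the number of neighbors) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: boxes are axis-aligned bird's-eye-view rectangles `(x1, y1, x2, y2)` rather than rotated 3D boxes, a "neighbor" is any other prediction whose IoU exceeds a hypothetical `iou_thresh`, and the combination rule in `combined_score` is invented for the example. `pearson` computes the PCC mentioned in the caption.

```python
import numpy as np


def pairwise_iou(boxes):
    """IoU matrix for axis-aligned boxes given as rows (x1, y1, x2, y2)."""
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = area[:, None] + area[None, :] - inter
    return inter / np.maximum(union, 1e-9)


def neighbor_stats(boxes, iou_thresh=0.3):
    """Return (mean IoU with neighbors, neighbor count) for each box.

    iou_thresh is a hypothetical neighbor criterion.
    """
    iou = pairwise_iou(boxes)
    np.fill_diagonal(iou, 0.0)  # a box is not its own neighbor
    is_neighbor = iou > iou_thresh
    count = is_neighbor.sum(axis=1)
    total = (iou * is_neighbor).sum(axis=1)
    # Boxes without neighbors get a mean IoU of 0.
    miou = np.where(count > 0, total / np.maximum(count, 1), 0.0)
    return miou, count


def combined_score(miou, count):
    """Hypothetical combination of both statistics (not the paper's rule)."""
    return miou * np.log1p(count)


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])
```

Given the real IoU of each prediction against its ground-truth box, `pearson(real_iou, miou)` and `pearson(real_iou, combined_score(miou, count))` would give the kind of correlation values that the scatterplots report as PCC.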