Uplifting Range-View-based 3D Semantic Segmentation in Real-Time with Multi-Sensor Fusion
Shiqi Tan, Hamidreza Fazlali, Yixuan Xu, Yuan Ren, Bingbing Liu
TL;DR
This work tackles the limitations of Range-View (RV)-based 3D semantic segmentation for autonomous driving by introducing LaCRange, a multi-sensor fusion framework that transfers distortion-free guidance from RGB images to a lightweight RV processor. It couples distortion-compensating knowledge distillation (DCKD) with a context-based feature fusion (CFF) module and a portable point refinement pipeline (SR^2FA and 3D-NAFA) to mitigate projection distortions and preserve 3D topology. On SemanticKITTI and nuScenes, LaCRange runs in real time with competitive or superior accuracy, and ablations show substantial gains from 3D neighborhood augmentation, effective fusion strategies, and robust distillation. The proposed components are modular, offering plug-and-play improvements for existing RV-based segmentation pipelines and enabling more reliable perception in diverse driving conditions.
Abstract
Range-View (RV)-based 3D point cloud segmentation is widely adopted due to its compact data form. However, RV-based methods fall short in providing robust segmentation for occluded points and suffer from distortion of projected RGB images due to the sparse nature of 3D point clouds. To alleviate these problems, we propose a new LiDAR and Camera Range-view-based 3D point cloud semantic segmentation method (LaCRange). Specifically, a distortion-compensating knowledge distillation (DCKD) strategy is designed to remedy the adverse effect of RV projection of RGB images. Moreover, a context-based feature fusion module is introduced for robust and preservative sensor fusion. Finally, to address the limited resolution of RV and its insufficient representation of 3D topology, a new point refinement scheme is devised for proper aggregation of features in 2D and augmentation of point features in 3D. We evaluated the proposed method on two large-scale autonomous driving datasets, i.e., SemanticKITTI and nuScenes. In addition to being real-time, the proposed method achieves state-of-the-art results on the nuScenes benchmark.
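To make the range-view representation concrete, the following is a minimal sketch of the standard spherical projection that maps a LiDAR point cloud onto a 2D range image. It is not the authors' implementation: the function name `range_view_projection`, the 64 x 2048 image size, and the +3 / -25 degree vertical field of view are assumptions chosen as typical values for a 64-beam sensor, not values taken from this paper. The sketch also illustrates the occlusion issue mentioned above: several 3D points can fall into the same pixel, and only the nearest one is kept.

```python
import numpy as np


def range_view_projection(points, height=64, width=2048,
                          fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    All default parameters are illustrative assumptions.
    Returns the range image (-1.0 where no point lands) and the
    per-point pixel coordinates (rows, cols).
    """
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view

    # Range, azimuth (yaw) and elevation (pitch) of every point.
    depth = np.maximum(np.linalg.norm(points, axis=1), 1e-8)
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(np.clip(points[:, 2] / depth, -1.0, 1.0))

    # Normalize angles to pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width            # column
    v = (1.0 - (pitch - fov_down) / fov) * height    # row

    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # When several points share a pixel, keep the smallest range.
    # The farther (occluded) points lose their own pixel.
    range_image = np.full((height, width), np.inf)
    np.minimum.at(range_image, (rows, cols), depth)
    range_image[np.isinf(range_image)] = -1.0
    return range_image, rows, cols


# Two points on the same ray plus one point to the side.
pts = np.array([[1.0, 0.0, 0.0],
                [5.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
img, rows, cols = range_view_projection(pts)
```

In this example the first two points share one pixel, so the point at 5 m has no pixel of its own in the range image. Recovering labels for such points is the kind of problem a 3D point refinement stage has to handle.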
