From One to the Power of Many: Invariance to Multi-LiDAR Perception from Single-Sensor Datasets
Marc Uecker, J. Marius Zöllner
TL;DR
The paper tackles the challenge of transferring LiDAR semantic segmentation models trained on single-sensor data to multi-sensor autonomous vehicle setups. It introduces two data augmentations, Frustum Drop and Mis-Calibration, which simulate multi-LiDAR effects and improve invariance, and proposes Normalized Feature Similarity ($NFS$) as a label-free proxy for cross-domain generalization. On both simulated and real data, the augmentations yield more invariant feature representations, and $NFS$ correlates strongly with out-of-domain performance (e.g., a linear fit of $rmIoU \approx 1.04 \times NFS + 1.63$ with $R^2 = 0.916$). The approach narrows the performance gap when models trained on single-sensor data are deployed on fused multi-sensor inputs, enabling more robust zero-shot generalization. $NFS$ could also potentially serve as a supervisory signal for further self-supervised training.
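To make the quoted linear fit concrete, the short sketch below evaluates it for a few $NFS$ values. The function name `predict_rmiou` and the sample inputs are hypothetical, and the sketch assumes that both $NFS$ and $rmIoU$ are expressed as percentages, which the summary does not state explicitly.

```python
def predict_rmiou(nfs: float, slope: float = 1.04, intercept: float = 1.63) -> float:
    """Evaluate the reported linear fit: rmIoU ~= slope * NFS + intercept.

    Assumption: NFS and rmIoU are both on a percentage scale.
    """
    return slope * nfs + intercept


# Hypothetical NFS values, chosen only for illustration.
for nfs in (60.0, 80.0, 95.0):
    print(f"NFS = {nfs:5.1f} -> predicted rmIoU ~= {predict_rmiou(nfs):.2f}")
```

Under that assumption, an $NFS$ of 80 would map to a predicted $rmIoU$ of about 84.83. Because the fit is a regression over the reported experiments, it should be read as a trend rather than as a guarantee for any individual model.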
Abstract
Recently, LiDAR segmentation methods for autonomous vehicles, powered by deep neural networks, have experienced steep growth in performance on classic benchmarks such as nuScenes and SemanticKITTI. However, large performance gaps remain when models trained on such single-sensor setups are deployed on modern vehicles with multiple high-resolution LiDAR sensors. In this work, we introduce a new metric for feature-level invariance, which can serve as a proxy for cross-domain generalization without requiring labeled data. Additionally, we propose two application-specific data augmentations that facilitate better transfer from single-sensor training datasets to multi-sensor LiDAR setups. We provide experimental evidence on both simulated and real data that our proposed augmentations improve invariance across LiDAR setups, leading to improved generalization.
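The text here names the two augmentations (Frustum Drop and Mis-Calibration) but does not describe how they are computed. The sketch below is one plausible reading, not the authors' implementation: `frustum_drop` removes all points inside a random horizontal angular sector, and `mis_calibrate` applies a small random rigid transform to half of the scan, as if it came from a second, imperfectly calibrated sensor. All function names, parameters, and value ranges are assumptions made for illustration.

```python
import numpy as np


def _angular_distance(points, center):
    """Wrapped absolute azimuth difference (radians) between each point and `center`."""
    azimuth = np.arctan2(points[:, 1], points[:, 0])
    return np.abs(np.angle(np.exp(1j * (azimuth - center))))


def frustum_drop(points, rng, max_width_rad=np.pi / 2):
    """Assumed reading of Frustum Drop: delete one random horizontal sector."""
    center = rng.uniform(-np.pi, np.pi)
    half_width = 0.5 * rng.uniform(0.0, max_width_rad)
    keep = _angular_distance(points, center) > half_width
    return points[keep]


def mis_calibrate(points, rng, max_yaw_rad=np.deg2rad(1.0), max_shift_m=0.05):
    """Assumed reading of Mis-Calibration: perturb the pose of half of the scan."""
    out = points.copy()
    center = rng.uniform(-np.pi, np.pi)
    # Treat the half of the scan facing `center` as a hypothetical second sensor.
    mask = _angular_distance(points, center) < np.pi / 2
    yaw = rng.uniform(-max_yaw_rad, max_yaw_rad)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    shift = rng.uniform(-max_shift_m, max_shift_m, size=3)
    # Rotate about the z-axis, then translate; extra feature columns stay untouched.
    out[mask, :3] = points[mask, :3] @ rot.T + shift
    return out


# Usage on a synthetic point cloud (random data, for illustration only).
rng = np.random.default_rng(0)
cloud = rng.normal(size=(1000, 3))
dropped = frustum_drop(cloud, rng)
perturbed = mis_calibrate(cloud, rng)
print(cloud.shape, dropped.shape, perturbed.shape)
```

Both functions return new arrays and leave the input unchanged, so they can be chained in a training pipeline. The sector width and the perturbation magnitudes are placeholders and would need to be tuned to the target sensor setup.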
