From One to the Power of Many: Invariance to Multi-LiDAR Perception from Single-Sensor Datasets

Marc Uecker, J. Marius Zöllner

TL;DR

The paper tackles the challenge of transferring LiDAR semantic segmentation models trained on single-sensor data to multi-sensor autonomous vehicle setups. It introduces two data augmentations, Frustum Drop and Mis-Calibration, which simulate multi-LiDAR effects and improve invariance, and it proposes Normalized Feature Similarity ($NFS$) as a label-free proxy for cross-domain generalization. On both simulated and real data, the augmentations yield more invariant feature representations, and $NFS$ correlates strongly with out-of-domain performance (e.g., $rmIoU \approx 1.04 \times NFS + 1.63$, $R^2 = 0.916$). The approach reduces the performance gap when models trained on a single sensor are deployed on fused multi-sensor inputs, enabling more robust zero-shot generalization. It also opens the possibility of continued self-supervised training with $NFS$ as the supervisory signal.
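As a rough illustration of how the reported linear fit could be used, the sketch below estimates relative mIoU from an NFS value. It assumes that $NFS$ and $rmIoU$ are both expressed as percentages, which the size of the intercept suggests but this summary does not state. The function names are hypothetical, and the coefficients are simply the ones quoted above, so the estimate only applies to the models and sensor setups covered by that fit.

```python
# Coefficients of the linear fit quoted in the TL;DR (R^2 = 0.916).
FIT_SLOPE = 1.04
FIT_INTERCEPT = 1.63


def rmiou_percent(miou_out_of_domain: float, miou_in_domain: float) -> float:
    """Relative mIoU: out-of-domain mIoU as a percentage of in-domain mIoU."""
    if miou_in_domain <= 0:
        raise ValueError("in-domain mIoU must be positive")
    return 100.0 * miou_out_of_domain / miou_in_domain


def estimate_rmiou_from_nfs(nfs_percent: float) -> float:
    """Estimate rmIoU (in percent) from NFS using the quoted linear fit.

    Assumption: NFS is given on a 0-100 percentage scale.
    """
    return FIT_SLOPE * nfs_percent + FIT_INTERCEPT
```

For example, under these assumptions an NFS of 90 would map to an estimated rmIoU of about 95.2% of the in-domain mIoU, without needing any labels for the target sensor setup.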

Abstract

Recently, LiDAR segmentation methods for autonomous vehicles, powered by deep neural networks, have experienced steep growth in performance on classic benchmarks such as nuScenes and SemanticKITTI. However, there are still large gaps in performance when deploying models trained on such single-sensor setups to modern vehicles with multiple high-resolution LiDAR sensors. In this work, we introduce a new metric for feature-level invariance, which can serve as a proxy to measure cross-domain generalization without requiring labeled data. Additionally, we propose two application-specific data augmentations that facilitate better transfer to multi-sensor LiDAR setups for models trained on single-sensor datasets. We provide experimental evidence on both simulated and real data that our proposed augmentations improve invariance across LiDAR setups, leading to improved generalization.

Paper Structure (25 sections, 4 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: We tackle the problem of zero-shot generalization to unseen data from modern multi-LiDAR vehicles, for which labels are not available.
  • Figure 2: Correlation between our Normalized Feature Similarity metric and out-of-distribution relative mIoU Score (rmIoU, as a percentage of in-domain test set mIoU) for a variety of models and sensor setups.
  • Figure 3: The two augmentations proposed in this work.
  • Figure 4: Comparing relative mIoU score (left) and NFS (right) of our proposed augmentations on the test partition of the simulated out-of-domain Corner sensor setups shown in \ref{fig:4:feature_similarity}.
  • Figure 5: Comparing relative mIoU score (left) and NFS (right) of our proposed augmentations on the test partition of variations of the training setup with varying vertical LiDAR resolution (number of channels). The in-domain setup has 64 channels. mIoU scores are listed in \ref{tab:results:corners_channels}.
  • ...and 2 more figures