
DRUM: Diffusion-based Raydrop-aware Unpaired Mapping for Sim2Real LiDAR Segmentation

Tomoya Miyawaki, Kazuto Nakashima, Yumi Iwashita, Ryo Kurazume

Abstract

LiDAR-based semantic segmentation is a key component for autonomous mobile robots, yet large-scale annotation of LiDAR point clouds is prohibitively expensive and time-consuming. Although simulators can provide labeled synthetic data, models trained on synthetic data often underperform on real-world data due to a data-level domain gap. To address this issue, we propose DRUM, a novel Sim2Real translation framework. We leverage a diffusion model pre-trained on unlabeled real-world data as a generative prior and translate synthetic data by reproducing two key measurement characteristics: reflectance intensity and raydrop noise. To improve sample fidelity, we introduce a raydrop-aware masked guidance mechanism that selectively enforces consistency with the input synthetic data while preserving realistic raydrop noise induced by the diffusion prior. Experimental results demonstrate that DRUM consistently improves Sim2Real performance across multiple representations of LiDAR data. The project page is available at https://miya-tomoya.github.io/drum.
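To make the data flow described in the abstract concrete, below is a minimal sketch of how translated samples could feed segmentation training. It assumes hypothetical helpers `drum_translate` (the Sim2Real translation) and `train_segmenter` (any off-the-shelf LiDAR segmentation trainer); neither name comes from the paper. Because the translation only edits measurement characteristics (reflectance intensity and raydrop noise), the simulation labels carry over unchanged.

```python
# Minimal sketch of the training pipeline implied by the abstract.
# `drum_translate` and `train_segmenter` are hypothetical placeholders,
# not code released with the paper.

def build_pseudo_real_dataset(sim_scans, sim_labels, drum_translate):
    """Translate labeled simulation scans into labeled pseudo-real scans.

    Labels are reused as-is: the translation changes measurement
    characteristics (reflectance intensity, raydrop noise), not the
    scene geometry or semantics.
    """
    pseudo_real = [drum_translate(scan) for scan in sim_scans]
    return list(zip(pseudo_real, sim_labels))

# Usage (hypothetical):
# dataset = build_pseudo_real_dataset(sim_scans, sim_labels, drum_translate)
# model = train_segmenter(dataset)  # any LiDAR segmentation backbone
```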

Paper Structure

This paper contains 13 sections, 8 equations, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: Problem formulation. Given unpaired sets of labeled simulation and unlabeled real samples, our framework DRUM generates labeled pseudo-real samples for training LiDAR segmentation.
  • Figure 2: Overview of Sim2Real translation by DRUM. (a) The unconditional generation process of diffusion models is conditioned on the simulation sample $\bm{y}$ via masked guidance. (b) In masked guidance, we first generate the raydrop-aware mask $\bm{m}_t$ from the tentative Tweedie sample $\hat{\bm{x}}_t$ and then compute the sim–real discrepancy based on the pseudoinverse-guided method [song2023pseudoinverse-guided]. The operator $H$ corrupts the reflectance modality (see the sketch after this list).
  • Figure 3: Ablation of our masked guidance. We compare the range (top row) and reflectance (bottom row) pseudo-real samples produced with and without our raydrop-aware masked guidance. Our method successfully reproduces the raydrop noise on the car, as well as the reflectance modality.
  • Figure 4: Qualitative comparison of pseudo-real samples. We compare the range (top row) and reflectance (bottom row) samples produced by different methods. The reflectance input comes from the rendering model of [xiao2022transfer], although we do not use it when producing the pseudo-real samples. Our method achieves better consistency with the input and fidelity comparable to the reference.
  • Figure 5: Qualitative comparison of semantic segmentation results. We show the results of the image-based method on the image representation (top row) and the point cloud representation (bottom row). Our approach significantly improves car detection (black circles) compared to the baseline.
  • ...and 2 more figures
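As a rough illustration of the masked guidance described in the Figure 2 caption, the sketch below implements a single guided denoising step: a Tweedie estimate of the clean sample, a raydrop mask derived from that estimate, and a pseudoinverse-style consistency term against the simulation input restricted to non-dropped pixels. The channel layout, the thresholding rule for the mask, the guidance update, and all function names are assumptions made for illustration; this does not reproduce the paper's exact formulation.

```python
import torch

def drum_guided_step(x_t, t, y_sim, eps_model, alpha_bar,
                     guidance_scale=1.0, raydrop_thresh=0.0):
    """One reverse-diffusion step with raydrop-aware masked guidance (sketch).

    x_t:       current noisy sample, shape (B, 2, H, W), channels = (range, reflectance)
    y_sim:     synthetic input sample in the same layout
    eps_model: noise-prediction network epsilon_theta(x_t, t)
    alpha_bar: 1-D tensor with the cumulative noise schedule, indexable by t
    All names, shapes, and the masking rule are illustrative assumptions.
    """
    x_t = x_t.detach().requires_grad_(True)
    a_bar = alpha_bar[t]

    # Tweedie estimate of the clean sample from the current noisy one.
    eps = eps_model(x_t, t)
    x0_hat = (x_t - torch.sqrt(1.0 - a_bar) * eps) / torch.sqrt(a_bar)

    # Raydrop-aware mask: keep guidance only where the tentative estimate
    # predicts a valid (non-dropped) return; assumed here to be read from
    # the range channel with a simple threshold.
    m_t = (x0_hat[:, :1] > raydrop_thresh).float().detach()

    # The operator H corrupts the reflectance modality, so consistency with
    # the simulation input is enforced on the range channel only.
    def H(x):
        return x[:, :1]

    # Masked sim-real discrepancy in the spirit of pseudoinverse guidance.
    residual = m_t * (H(y_sim) - H(x0_hat))
    loss = (residual ** 2).sum()
    grad = torch.autograd.grad(loss, x_t)[0]

    # Nudge the denoised estimate toward the masked measurement; the usual
    # ancestral sampling update that follows is omitted here.
    x0_guided = x0_hat - guidance_scale * grad
    return x0_guided, m_t
```

The key design choice mirrored from the caption is that the guidance is masked: pixels the prior predicts as dropped rays are excluded from the consistency term, so the realistic raydrop noise induced by the diffusion prior is preserved rather than overwritten by the synthetic input.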