ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

Bo Zhang, Xinyu Cai, Jiakang Yuan, Donglin Yang, Jianfei Guo, Xiangchao Yan, Renqiu Xia, Botian Shi, Min Dou, Tao Chen, Si Liu, Junchi Yan, Yu Qiao

TL;DR

This work addresses domain shift in autonomous driving by proposing ReSimAD, a zero-shot transfer framework that reconstructs source-domain 3D scenes as meshes and then renders target-domain-like point clouds with a configurable simulator. The method decouples domain characteristics and forms a reconstruction-simulation-perception pipeline, enabling zero-shot target-domain perception and, potentially, 3D pre-training without real target-domain data. It introduces a Reconstruction-Simulation Dataset built with Waymo as the source and KITTI, nuScenes, and ONCE as targets, and reports strong zero-shot 3D detection across multiple cross-domain settings, often outperforming UDA baselines that require target-domain data. The approach could reduce data-collection costs and speed up deployment to new sensors or regions.

Abstract

Domain shifts, such as changes in sensor type and geographical conditions, are prevalent in Autonomous Driving (AD). They pose a challenge because an AD model that relies on knowledge from a previous domain can hardly be deployed directly to a new domain without additional cost. In this paper, we provide a new perspective and approach for alleviating domain shifts by proposing a Reconstruction-Simulation-Perception (ReSimAD) scheme. Specifically, the implicit reconstruction process is based on knowledge from the old domain and aims to convert domain-related knowledge into domain-invariant representations, e.g., 3D scene-level meshes. The point-cloud simulation process for multiple new domains is then conditioned on these reconstructed 3D meshes, yielding target-domain-like simulation samples and thus reducing the cost of collecting and annotating new-domain data for the subsequent perception process. For experiments, we consider different cross-domain settings, such as Waymo-to-KITTI, Waymo-to-nuScenes, and Waymo-to-ONCE, to verify zero-shot target-domain perception using ReSimAD. Results demonstrate that our method improves domain generalization ability and is even promising for 3D pre-training.
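
To make the mesh-to-point idea concrete, the following is a minimal sketch of how a point cloud could be simulated from a triangle mesh under a configurable beam layout. It is not the simulator used in the paper: the function names (`ray_triangle`, `simulate_lidar`), the toy ground-plane mesh, and the beam elevations are all hypothetical, and the intersection test is the standard Möller-Trumbore algorithm, chosen here only for illustration.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test: distance along the ray to the triangle, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    t_vec = sub(origin, v0)
    u = dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(t_vec, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None   # keep only hits in front of the sensor

def simulate_lidar(mesh, sensor_pos, elevations_deg, n_azimuth, max_range):
    """Cast one ray per (beam elevation, azimuth step); return nearest hit points."""
    points = []
    for el_deg in elevations_deg:
        el = math.radians(el_deg)
        for k in range(n_azimuth):
            az = 2.0 * math.pi * k / n_azimuth
            d = (math.cos(el) * math.cos(az),
                 math.cos(el) * math.sin(az),
                 math.sin(el))      # unit vector, so t is the range in meters
            hits = [t for t in (ray_triangle(sensor_pos, d, tri) for tri in mesh)
                    if t is not None]
            if hits and min(hits) <= max_range:
                t = min(hits)
                points.append(tuple(sensor_pos[i] + t * d[i] for i in range(3)))
    return points

# Toy stand-in for a reconstructed scene: a flat ground patch made of two triangles.
A, B, C, D = (-40.0, -60.0, 0.0), (60.0, -60.0, 0.0), (60.0, 40.0, 0.0), (-40.0, 40.0, 0.0)
ground = [(A, B, C), (A, C, D)]

# Two hypothetical sensor layouts rendered from the same mesh (values are made up).
sparse = simulate_lidar(ground, (0.0, 0.0, 2.0), [-45.0, -10.0, 5.0], 8, 100.0)
dense = simulate_lidar(ground, (0.0, 0.0, 2.0), [-45.0, -30.0, -20.0, -10.0], 16, 100.0)
print(len(sparse), len(dense))
```

The point of the sketch is that the mesh stays fixed while the sensor parameters (beam elevations, azimuth resolution, mounting height, maximum range) vary, so one reconstructed scene can yield point clouds with different sampling patterns.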

Paper Structure (17 sections, 9 equations, 10 figures, 6 tables)

This paper contains 17 sections, 9 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Different paradigms for cross-domain Autonomous Driving (AD), where Rec. denotes the employed reconstruction scheme. (a) Directly simulating target-domain data with CARLA [dosovitskiy2017carla] expands the number of target-domain training samples, but the diversity of data simulated directly from CARLA is low. (b) Other works [yang2021st3d, yang2022st3d++, yuan2023bi3d, wei2022lidar] often employ Unsupervised Domain Adaptation (UDA) to learn from the target-domain data distribution, but this requires collecting massive numbers of real-world target-domain samples, which is expensive. (c) In the proposed ReSimAD paradigm, well-labeled data from the old domain is used to reconstruct the 3D scene, and target-domain-like data is then simulated from the reconstructed scene, achieving promising zero-shot detection accuracy on the target domain.
  • Figure 2: Visual comparison of (a) the real domain and (b) the simulated domain. The domain simulated by ReSimAD is close to the real domain; for example, it reproduces the slope of the road.
  • Figure 2: Zero-shot and fully-supervised (SFT and Oracle) results on the target domain. For the SFT setting, we use the checkpoint pre-trained on the simulated data as the backbone initialization and fine-tune on the labeled target domain.
  • Figure 3: Overview of ReSimAD, which consists of point-to-mesh reconstruction, mesh-to-point simulation, and zero-shot perception. Each part is detailed in Sec. 4.
  • Figure 4: Distribution differences of object size (Length, Width, and Height) across datasets. Compared with the off-the-shelf public datasets, the simulation dataset constructed by the proposed ReSimAD covers a wider distribution.
  • ...and 5 more figures