LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training

Thomas Kreutz, Jens Lemke, Max Mühlhäuser, Alejandro Sanchez Guinea

Abstract

In this paper, we propose LiOn-XA, an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation to bridge the domain gap arising from changes in environment and sensor setup. Unlike existing works that exploit multiple data modalities such as point clouds and RGB images, we address UDA in scenarios where RGB images might not be available and show that two distinct LiDAR data representations can learn from each other for UDA. More specifically, we combine 3D voxelized point clouds, which preserve important geometric structure, with 2D projection-based range images, which provide information such as object orientations or surfaces. To further align the feature space between both domains, we apply adversarial training using the features and predictions of both the 2D and 3D neural networks. Our experiments on three real-to-real adaptation scenarios demonstrate the effectiveness of our approach, achieving new state-of-the-art performance compared to previous uni- and multi-modal UDA methods. Our source code is publicly available at https://github.com/JensLe97/lion-xa.

Paper Structure

This paper contains 34 sections, 7 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: LiOn-XA consists of a source and target module. The source module optimizes the 2D and 3D networks with a supervised segmentation loss $\mathcal{L}_{sup}$ on the source domain data as well as target-like data. The target module contains unlabelled data only from the target domain. Both modules optimize both networks with a respective cross-modal loss $\mathcal{L}_{xM}(\cdot)$. Finally, the discriminator module further connects source and target representations for unsupervised domain adaptation using an adversarial loss on the feature representations $\mathcal{L}_{adv}$ to enforce feature alignment between both domains.
  • Figure 2: Range image consisting of a range, remission and normal map.
  • Figure 3: Qualitative results on the SemanticKITTI to nuScenes-Lidarseg adaptation scenario. $\blacksquare$ Vehicle, $\blacksquare$ Driveable surface, $\blacksquare$ Sidewalk, $\blacksquare$ Terrain, $\blacksquare$ Manmade, $\blacksquare$ Vegetation, $\blacksquare$ Ignore label.
  • Figure 4: The basic architecture of LiOn-XA. It is best read from both sides toward the middle, following the data flow.
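
The 2D projection-based range image mentioned in the abstract and in Figure 2 can be illustrated with a standard spherical projection. The following is a minimal sketch, not the authors' implementation: the function name `project_to_range_image`, the image size (32 x 1024), and the vertical field of view (+10° to -30°) are assumptions chosen for illustration, and only the range and remission channels are produced (the normal map from Figure 2 is omitted).

```python
import numpy as np


def project_to_range_image(points, remission, height=32, width=1024,
                           fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project a LiDAR point cloud onto a 2D range image (illustrative sketch).

    points:    (N, 3) array of x, y, z coordinates in the sensor frame.
    remission: (N,) array of per-point remission (intensity) values.
    Returns a (2, height, width) image with range and remission channels
    and a boolean (height, width) mask of pixels that received a point.
    The image size and field of view are assumed values, not from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    remission = np.asarray(remission, dtype=np.float64)

    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Range of every point; drop points at the origin to avoid division by zero.
    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0
    points, remission, depth = points[valid], remission[valid], depth[valid]

    # Azimuth (yaw) and elevation (pitch) of each point.
    yaw = np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / depth)

    # Map the angles to pixel coordinates: columns from yaw, rows from pitch.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    u = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # If several points fall into one pixel, keep only the nearest one:
    # sort by range, then take the first occurrence of every pixel index.
    order = np.argsort(depth)
    flat_pixel = (v * width + u)[order]
    _, first = np.unique(flat_pixel, return_index=True)
    keep = order[first]

    image = np.zeros((2, height, width), dtype=np.float32)
    mask = np.zeros((height, width), dtype=bool)
    image[0, v[keep], u[keep]] = depth[keep]
    image[1, v[keep], u[keep]] = remission[keep]
    mask[v[keep], u[keep]] = True
    return image, mask
```

Each pixel stores the nearest return along its ray, so the 2D network sees a dense, image-like view of the same scan that the 3D network consumes as voxels.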