LVIC: Multi-modality segmentation by Lifting Visual Info as Cue

Zichao Dong, Bowen Pang, Xufeng Huang, Hang Ji, Xin Zhan, Junbo Chen

TL;DR

This work proposes a depth-aware point painting mechanism that significantly improves multi-modality fusion for LiDAR semantic segmentation, and examines which visual features are most useful to a LiDAR segmentation model.

Abstract

Multi-modality fusion has proven effective for 3D perception in autonomous driving. However, most current multi-modality fusion pipelines for LiDAR semantic segmentation rely on complicated fusion mechanisms. Point painting is a straightforward alternative that directly binds LiDAR points to visual information. Unfortunately, previous point-painting-like methods suffer from projection error between the camera and the LiDAR. In our experiments, we find that this projection error is the main bottleneck in point painting. We therefore propose a depth-aware point painting mechanism, which significantly improves multi-modality fusion. In addition, we take a deeper look at which visual features are desirable for LiDAR semantic segmentation. By Lifting Visual Information as Cue, LVIC ranks 1st on the nuScenes LiDAR semantic segmentation benchmark. Our experiments demonstrate its robustness and effectiveness. Code will be made publicly available soon.
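
The abstract does not spell out how depth awareness is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It shows plain point painting (project each LiDAR point into the image and copy the label found at that pixel) plus a hypothetical depth-consistency gate that rejects a point when its range disagrees with a per-pixel depth estimate. The names `project_point`, `paint_points`, `depth_map`, and `depth_tol` are all assumptions introduced for illustration, and the points are assumed to be already expressed in the camera frame.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in the camera frame, z forward


def project_point(p: Point, fx: float, fy: float, cx: float, cy: float
                  ) -> Optional[Tuple[float, float, float]]:
    """Pinhole projection; returns (u, v, depth) or None if behind the camera."""
    x, y, z = p
    if z <= 0:
        return None
    return fx * x / z + cx, fy * y / z + cy, z


def paint_points(points: Sequence[Point],
                 label_map: Sequence[Sequence[str]],
                 depth_map: Sequence[Sequence[float]],
                 intrinsics: Tuple[float, float, float, float],
                 depth_tol: float = 0.5) -> List[Optional[str]]:
    """Attach an image label to each point, or None if it cannot be painted.

    The depth gate is a hypothetical stand-in for "depth awareness": a point
    whose range differs from the pixel's estimated depth by more than
    depth_tol is treated as a mis-projection and left unpainted.
    """
    fx, fy, cx, cy = intrinsics
    height, width = len(label_map), len(label_map[0])
    painted: List[Optional[str]] = []
    for p in points:
        proj = project_point(p, fx, fy, cx, cy)
        if proj is None:
            painted.append(None)
            continue
        u, v, z = proj
        col, row = int(round(u)), int(round(v))
        if not (0 <= row < height and 0 <= col < width):
            painted.append(None)  # falls outside the image
            continue
        if abs(depth_map[row][col] - z) > depth_tol:
            painted.append(None)  # depth mismatch, likely projection error
            continue
        painted.append(label_map[row][col])
    return painted


# Toy 2x2 image: left column is road at 10 m, right column is a car at 5 m.
labels = [["road", "car"], ["road", "car"]]
depths = [[10.0, 5.0], [10.0, 5.0]]
pts = [(0.0, 0.0, 10.0), (5.0, 0.0, 5.0), (10.0, 0.0, 10.0)]
print(paint_points(pts, labels, depths, (1.0, 1.0, 0.0, 0.0)))
# -> ['road', 'car', None]
```

In the toy example the third point lies 10 m away but projects onto the car pixel, whose estimated depth is 5 m. Plain point painting would label it "car", while the depth gate leaves it unpainted.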

Paper Structure (22 sections, 2 figures, 1 table)

Figures (2)

  • Figure 1: Pipeline of our visual encoder. The whole model can be an arbitrary encoder-decoder architecture. Low-level features are saved during encoding.
  • Figure 2: Components of the fusion model. We use only a simple linear layer as an adaptor to fuse features from multiple domains.