
RePLAy: Remove Projective LiDAR Depthmap Artifacts via Exploiting Epipolar Geometry

Shengjie Zhu, Girish Chandar Ganesan, Abhinav Kumar, Xiaoming Liu

TL;DR

The paper tackles persistent projective artifacts in LiDAR depthmaps caused by the baseline between LiDAR and RGB sensors. It introduces RePLAy, a parameter-free analytical method that builds a virtual LiDAR camera to form a binocular system with the RGB camera and detects artifacts as epipolar occlusion, aided by an auto-calibration step. Across datasets like KITTI, nuScenes, Waymo, and DDAD, RePLAy yields consistent improvements for state-of-the-art monocular depth estimation and monocular 3D object detection when depthmaps are artifact-free, and the authors release processed depthmaps to benefit the community. This work offers a practical, calibration-based solution that extends artifact removal to datasets lacking stereo imagery, enhancing depth supervision and downstream perception tasks in autonomous driving and related fields.

Abstract

3D sensing is a fundamental task for Autonomous Vehicles. Its deployment often relies on aligned RGB cameras and LiDAR. Despite meticulous synchronization and calibration, systematic misalignment persists in the LiDAR-projected depthmap. This is due to the physical baseline distance between the two sensors. The artifact is often reflected as background LiDAR points incorrectly projected onto foreground objects, such as cars and pedestrians. The KITTI dataset uses stereo cameras as a heuristic solution to remove artifacts. However, most AV datasets, including nuScenes, Waymo, and DDAD, lack stereo images, making the KITTI solution inapplicable. We propose RePLAy, a parameter-free analytical solution to remove the projective artifacts. We construct a binocular vision system between a hypothesized virtual LiDAR camera and the RGB camera. We then remove the projective artifacts by determining the epipolar occlusion with the proposed analytical solution. We show consistent improvement in State-of-The-Art (SoTA) monocular depth estimators and 3D object detectors with the artifact-free depthmaps.
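
To make the source of the artifact concrete, the following is a minimal numerical sketch, not the authors' method. It assumes a pinhole RGB camera at the origin (x right, y down, z forward), a LiDAR mounted 1 m above it, and two hand-picked points; the intrinsics `K`, the mounting offset, and the helper names are all hypothetical. It only illustrates that a background point the LiDAR can see over a foreground object may land on the same RGB pixel as the foreground point.

```python
import numpy as np

def project_pinhole(points, K):
    """Project Nx3 camera-frame points to pixels; returns (uv, depth)."""
    uvw = points @ K.T
    depth = uvw[:, 2]
    return uvw[:, :2] / depth[:, None], depth

def elevation_from(origin, points):
    """Elevation angle (radians) of each point as seen from `origin` (y points down)."""
    d = points - origin
    return np.arctan2(-d[:, 1], np.hypot(d[:, 0], d[:, 2]))

# Hypothetical intrinsics and sensor layout (assumptions, not from the paper).
K = np.array([[720.0, 0.0, 640.0],
              [0.0, 720.0, 360.0],
              [0.0, 0.0, 1.0]])
lidar_origin = np.array([0.0, -1.0, 0.0])  # LiDAR sits 1 m above the camera

# A foreground point at 5 m and a background point at 20 m on the same camera ray.
points = np.array([[0.0, 0.0, 5.0],
                   [0.0, 0.0, 20.0]])

uv, depth = project_pinhole(points, K)
elev = elevation_from(lidar_origin, points)

print("RGB pixels:", uv)             # both points fall on the same pixel
print("depths:", depth)              # but carry different depths
print("LiDAR elevations:", elev)     # the LiDAR sees them along different rays
```

Because the two points lie on distinct LiDAR rays, the LiDAR can return both, yet the camera would only observe the nearer one. Writing the farther depth into that pixel is the kind of projective artifact the abstract describes.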

Paper Structure

This paper contains 23 sections, 3 theorems, 27 equations, 19 figures, 9 tables, 1 algorithm.

Key Result

Lemma

The depthmap of the virtual LiDAR camera is free of projective artifacts, i.e., no two scanned points $\mathbf{v}_1$ and $\mathbf{v}_2$ overlap after projection.

Figures (19)

  • Figure 1: Remove Projective LiDAR Depthmap Artifacts. The LiDAR projected depthmap has misalignment with the RGB camera. The background scans highlighted by blue arrows incorrectly overlay the foreground such as cars and pedestrians. The artifacts persist unattended in most AV datasets including KITTI-360 [liao2022kitti360], nuScenes [caesar2020nuscenes], Waymo [sun2020scalability], and DDAD [packnet]. We propose an easy-to-use, parameter-free, and analytical solution to remove the projective artifacts, shown by green arrows.
  • Figure 2: LiDAR-RGB sensor sets have a physical baseline distance between the two sensors, resulting in the projective artifacts of Fig. \ref{fig:teaser}. (a) The background scanned by LiDAR incorrectly overlays on the foreground pedestrian observed by the RGB camera.
  • Figure 3: Qualitative comparison with KITTI semi-dense depthmap. RePLAy only requires the LiDAR calibration matrix. KITTI [uhrig2017sparsity] uses multi-frame fusion with the stereo camera to remove projective artifacts. The higher density of the KITTI depthmap comes from the additional sensor and temporal fusion.
  • Figure 4: Virtual LiDAR camera and RGB Binocular System. In (a), we plot the bird's-eye view of the point cloud $\mathcal{V}$ from LiDAR. A binocular system is set between the virtual LiDAR (epipolar left) and the RGB camera (epipolar right). Subfigure (b) plots the depthmap $\mathbf{D}_l$ from the LiDAR view. As stated in Lemma \ref{collary_lidar}, depthmap $\mathbf{D}_l$ does not have projective artifacts. Comparing (b) and (c), the artifacts arise when transferring from the LiDAR to the RGB camera, due to the epipolar occlusion within the binocular system. We formally define the projective artifacts (or epipolar occlusion) in Eq. \ref{eqn:def_occ}.
  • Figure 5: Auto-Calibration. The LiDAR point cloud $\mathcal{V}$ in (a) is mapped to spherical coordinates $\theta - \phi$ with Eq. \ref{eqn:spherecal}. We use a binary matrix $\mathbf{S}$ to record the unique pixel locations $\mathcal{S}$ after rasterization. Plots (b) and (c) show matrix $\mathbf{S}$ before and after auto-calibration, where duplication (marked by arrows) is minimized with the loss $L$ of Eq. \ref{eqn:auto_loss}. Plots (d) and (e) contrast the virtual LiDAR depthmap $\mathbf{D}_l$ before and after auto-calibration, where Lemma \ref{collary_lidar} is fulfilled.
  • ...and 14 more figures
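
The rasterization step described in the Figure 5 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the spherical mapping below uses a common azimuth/elevation convention that may differ from the paper's equation, the grid size and field-of-view values are hypothetical, and the auto-calibration loss $L$ is not reproduced. The sketch only builds a binary occupancy matrix `S` and counts how many points collide in an already-occupied cell.

```python
import numpy as np

def rasterize_spherical(points, h_bins, w_bins, fov_up, fov_down):
    """Rasterize Nx3 LiDAR-frame points onto a theta-phi grid.

    Returns a boolean occupancy matrix S and the number of duplicated points,
    i.e. points that share a cell with another point.
    Assumed convention: theta = atan2(y, x), phi = asin(z / r).
    """
    x, y, z = points.T
    r = np.linalg.norm(points, axis=1)
    theta = np.arctan2(y, x)   # azimuth in [-pi, pi]
    phi = np.arcsin(z / r)     # elevation

    col = np.floor((theta + np.pi) / (2 * np.pi) * w_bins).astype(int) % w_bins
    row = np.floor((fov_up - phi) / (fov_up - fov_down) * h_bins).astype(int)
    row = np.clip(row, 0, h_bins - 1)

    S = np.zeros((h_bins, w_bins), dtype=bool)
    S[row, col] = True
    duplicates = len(points) - int(S.sum())
    return S, duplicates

# Contrived toy input: the first two points share a direction, so they collide.
toy_points = np.array([[1.0, 0.0, 0.0],
                       [2.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])
S, duplicates = rasterize_spherical(
    toy_points, h_bins=64, w_bins=360,
    fov_up=np.radians(10.0), fov_down=np.radians(-30.0),
)
print("occupied cells:", int(S.sum()), "duplicates:", duplicates)
```

In this reading, a lower duplicate count corresponds to a rasterization in which scanned points overlap less, which is the property the caption says auto-calibration aims for.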

Theorems & Definitions (6)

  • Lemma
  • Lemma
  • Lemma
  • Proof
  • Proof
  • Proof