DOC-Depth: A novel approach for dense depth ground truth generation

Simon de Moreau, Mathias Corsia, Hassan Bouchiba, Yasser Almehio, Andrei Bursuc, Hafid El-Idrissi, Fabien Moutarde

TL;DR

DOC-Depth tackles the challenge of obtaining fully dense, metric-depth ground truth in dynamic real-world scenes by introducing a learning-free, LiDAR-only approach. It aggregates LiDAR frames to form dense 3D reconstructions, classifies dynamic points with the DOC method, and renders dense depth maps from camera viewpoints using a novel composite rendering pipeline. Key contributions include the DOC dynamic-object classifier with ground segmentation and voting, a publicly released fully dense KITTI annotation, and demonstrated generalizability across multiple LiDAR sensors and datasets. The approach enables scalable dense-depth dataset generation with strong geometric fidelity, improving downstream depth-estimation training and nighttime performance while reducing reliance on learning-based methods that must generalize across domains.
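The aggregate-then-render idea summarized above can be illustrated with a minimal sketch. It is not the authors' composite rendering pipeline: it assumes a fully static scene (no DOC classification), known 4x4 sensor-to-world poses, a pinhole camera with intrinsics `K`, and a plain z-buffer that keeps the nearest point per pixel. The function names `aggregate_frames` and `render_depth` are hypothetical.

```python
import numpy as np

def aggregate_frames(frames, poses):
    """Move each LiDAR frame (N_i x 3, sensor coordinates) into the world
    frame with its 4x4 sensor-to-world pose, then stack all points."""
    world_points = []
    for points, pose in zip(frames, poses):
        homogeneous = np.hstack([points, np.ones((len(points), 1))])
        world_points.append((homogeneous @ pose.T)[:, :3])
    return np.vstack(world_points)

def render_depth(points_world, world_to_cam, K, height, width):
    """Project world points through a pinhole camera and keep the nearest
    depth per pixel (z-buffer). Pixels hit by no point are set to 0."""
    homogeneous = np.hstack([points_world, np.ones((len(points_world), 1))])
    cam = (homogeneous @ world_to_cam.T)[:, :3]
    cam = cam[cam[:, 2] > 0]  # keep only points in front of the camera
    uvw = cam @ K.T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], cam[inside, 2]
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)  # nearest point wins at each pixel
    depth[np.isinf(depth)] = 0.0     # assumption: 0 marks "no measurement"
    return depth
```

In this simplified form, moving objects would leave trails in the aggregated cloud and corrupt the rendered depth, which is the problem the DOC classification step is described as addressing.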

Abstract

Accurate depth information is essential for many computer vision applications. Yet, no available dataset recording method allows for fully dense, accurate depth estimation in a large-scale dynamic environment. In this paper, we introduce DOC-Depth, a novel, efficient and easy-to-deploy approach for dense depth generation from any LiDAR sensor. After reconstructing a consistent dense 3D environment using LiDAR odometry, we address dynamic object occlusions automatically thanks to DOC, our state-of-the-art dynamic object classification method. Additionally, DOC-Depth is fast and scalable, allowing for the creation of datasets unbounded in size and duration. We demonstrate the effectiveness of our approach on the KITTI dataset, improving its density from 16.1% to 71.2%, and release this new fully dense depth annotation to facilitate future research in the domain. We also showcase results using various LiDAR sensors and in multiple environments. All software components are publicly available for the research community.
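To make the density figures concrete, here is a small sketch of how such a percentage could be computed for a single depth map. It assumes that density means the share of image pixels carrying a valid depth value and that missing pixels are stored as 0; the abstract does not spell out the exact definition, and the helper name `depth_density` is hypothetical.

```python
import numpy as np

def depth_density(depth_map, invalid_value=0.0):
    """Return the percentage of pixels holding a valid depth value.
    Assumption: pixels equal to `invalid_value` carry no measurement."""
    valid = depth_map != invalid_value
    return 100.0 * np.count_nonzero(valid) / depth_map.size

# Toy example: 3 valid pixels out of 10.
toy = np.zeros((2, 5))
toy[0, :3] = 12.5
print(f"{depth_density(toy):.1f}%")
```

A dataset-level figure would typically average this value over all frames, possibly restricted to the image region the sensor can cover.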

Paper Structure

This paper contains 29 sections, 5 equations, 10 figures, 4 tables, 1 algorithm.

Figures (10)

  • Figure 1: DOC-Depth generates dense and accurate depth ground truth for training camera-based depth estimation systems. First, we aggregate LiDAR frames to obtain a dense 3D representation of the scene. Then, thanks to DOC, we classify dynamic points to handle them with specific rendering. Finally, we project the 3D reconstruction into the camera point of view, taking into account point distances and dynamic object occlusions.
  • Figure 2: Qualitative results of DOC-Depth against the learning-based approach BP-Net [tang2024bilateral], trained on KITTI and tested on our datasets. While the performance of the learning-based method drops when it is tested outside its training domain, our method works across domains with the same parameters. LiDAR used: 2x Hesai XT-32 (left) and Ouster OS1-128 (right).
  • Figure 3: Example of ground segmentation results on KITTI odometry sequence 07. Ground points (purple) and non-ground points (green) are accurately classified.
  • Figure 4: Illustration of the key frame selection method on a query frame (green dot). Colored triangles (respectively squares) correspond to the key frames selected in the coarsely (respectively finely) subsampled trajectory.
  • Figure 5: Example of cleaning on KITTI odometry sequence 07. The top image shows aggregated LiDAR frames classified with DOC. The bottom image shows the quality of mobile object removal using our method. No dynamic points remain, whether near the ground or floating.
  • ...and 5 more figures