Towards Domain-agnostic Depth Completion

Guangkai Xu, Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Jia-Wang Bian

TL;DR

This work addresses depth completion across diverse sensors and sparsity patterns, a setting where prior methods falter due to domain-specific training. It proposes a robust, domain-agnostic approach that fuses RGB images, sparse depth, and a guidance map derived from a pretrained single-image depth predictor, aided by training with diversified synthetic sparsity patterns. Two new benchmarks, GeneralSparsity and NoisySparsity, are introduced to evaluate cross-domain generalization and robustness to noisy inputs. Experiments show competitive performance on outdoor/indoor datasets and hardware-sourced depths, illustrating strong cross-domain generalization and resilience to sensor noise with a single trained model, enabling practical mobile-depth capture.

Abstract

Existing depth completion methods are often targeted at a specific sparse depth type and generalize poorly across task domains. We present a method to complete sparse/semi-dense, noisy, and potentially low-resolution depth maps obtained by various range sensors, including those in modern mobile phones, or by multi-view reconstruction algorithms. Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model. We propose an effective training scheme where we simulate various sparsity patterns in typical task domains. In addition, we design two new benchmarks to evaluate the generalizability and the robustness of depth completion methods. Our simple method shows superior cross-domain generalization ability against state-of-the-art depth completion methods, introducing a practical solution to high-quality depth capture on a mobile device. The code is available at: https://github.com/YvanYin/FillDepth.

Paper Structure (10 sections, 1 equation, 23 figures, 7 tables)

Figures (23)

  • Figure 1: Our method fills in missing information in different types of sparse depth maps. A single model can complete sparse depth from different sources, e.g. a Huawei Mate30 Time-of-Flight sensor (top) and a multi-view stereo algorithm [schops2017multi] (bottom).
  • Figure 2: Robustness analysis. We analyze the performance of CSPN [cheng2020cspn++] (completion) and Senushkin et al. [senushkin2020decoder] (inpainting) with respect to the number and pattern of input points (a, c) and the outlier ratio (b, d). CSPN is trained on NYU [silberman2012indoor] and evaluated on both NYU and ScanNet [dai2017scannet]. The method of Senushkin et al. is trained and evaluated on Matterport3D [Matterport3D].
  • Figure 3: Our method takes an RGB image, a sparse depth map, and a guidance map as input and outputs a dense, completed depth map.
  • Figure 4: Visualization of sampled sparse depths. We simulate three different patterns from (a) the dense depth for training models: (b) random uniform sampling, (c) feature point based sampling, and (d) region-based sampling.
  • Figure 5: RGBD capture using a Huawei phone and its up-projected sparse depth map. The depth and RGB sensors share the same field of view but differ in resolution: the RGB image is $1280\times 960$, whereas the depth map is $240\times 180$. "Up-projected Sparse Depth" refers to upsampling the raw depth map to the larger RGB resolution, which leaves the resulting depth map sparse. See the "Zoomed-in Region" for a closer view.
  • ...and 18 more figures
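
The captions for Figures 4 and 5 describe two ways a sparse depth map can arise: sampling from a dense depth map during training, and up-projecting a low-resolution sensor depth map onto the RGB grid. The following is a minimal sketch of both, not the authors' implementation. The function names, the use of 0 as the "no measurement" value, and the rectangular shape of the masked region are assumptions made for illustration. Feature point based sampling is omitted because it depends on a keypoint detector that the text above does not specify.

```python
import random

INVALID = 0.0  # assumed convention: 0 marks a pixel with no depth measurement


def sample_uniform(dense, num_points, seed=0):
    """Keep `num_points` pixels chosen uniformly at random (Figure 4b)."""
    h, w = len(dense), len(dense[0])
    rng = random.Random(seed)
    keep = rng.sample(range(h * w), num_points)
    sparse = [[INVALID] * w for _ in range(h)]
    for idx in keep:
        r, c = divmod(idx, w)
        sparse[r][c] = dense[r][c]
    return sparse


def mask_region(dense, top, left, height, width):
    """Invalidate one rectangle, one possible reading of Figure 4d."""
    sparse = [row[:] for row in dense]
    for r in range(top, min(top + height, len(dense))):
        for c in range(left, min(left + width, len(dense[0]))):
            sparse[r][c] = INVALID
    return sparse


def up_project(depth, out_h, out_w):
    """Scatter low-resolution depth pixels onto a larger grid (Figure 5)."""
    in_h, in_w = len(depth), len(depth[0])
    sparse = [[INVALID] * out_w for _ in range(out_h)]
    for r in range(in_h):
        for c in range(in_w):
            # Integer scaling sends each source pixel to a distinct target
            # pixel whenever the output is at least as large as the input.
            sparse[r * out_h // in_h][c * out_w // in_w] = depth[r][c]
    return sparse


def density(sparse):
    """Fraction of pixels that carry a depth value."""
    total = sum(len(row) for row in sparse)
    valid = sum(1 for row in sparse for v in row if v != INVALID)
    return valid / total


# Toy dense depth map with strictly positive values.
dense = [[1.0 + r + c for c in range(8)] for r in range(6)]
print(density(sample_uniform(dense, 12)))       # 0.25
print(density(mask_region(dense, 1, 2, 3, 4)))  # 0.75

# Resolutions from Figure 5: 240x180 depth, 1280x960 RGB.
low_res = [[2.0] * 240 for _ in range(180)]
print(density(up_project(low_res, 960, 1280)))  # 0.03515625
```

With the Figure 5 resolutions, only 43,200 of the 1,228,800 RGB-grid pixels receive a depth value, roughly 3.5%, which is why the up-projected map appears sparse even though the raw sensor output is dense at its own resolution.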