DEPTHOR++: Robust Depth Enhancement from a Real-World Lightweight dToF and RGB Guidance

Jijun Xiang, Longliang Liu, Xuan Zhu, Xianqi Wang, Min Lin, Xin Yang

TL;DR

DEPTHOR++ tackles real-world depth enhancement from lightweight dToF sensors by explicitly accounting for calibration errors and depth anomalies in RGB-dToF data. It combines a noise-robust training strategy built on dToF simulation, a learnable-parameter-free anomaly detector for sparse depth points, and a depth completion network that incorporates monocular depth priors and mixed-affinity refinement to produce accurate dense depth maps. The approach achieves state-of-the-art performance on ZJU-L5 and real-world samples, improving RMSE and Rel by 22% and 11% on average. It also improves on the previous SOTA by 37% in mirror regions of Mirror3D-NYU and, on the Hammer dataset with simulated low-cost dToF input, surpasses RealSense L515 measurements by an average of 22%. The work offers a practical, generalizable path toward robust depth enhancement for consumer devices and real-world applications such as 3D reconstruction and SLAM.

Abstract

Depth enhancement, which converts raw dToF signals into dense depth maps using RGB guidance, is crucial for improving depth perception in high-precision tasks such as 3D reconstruction and SLAM. However, existing methods often assume ideal dToF inputs and perfect dToF-RGB alignment, overlooking calibration errors and anomalies, thus limiting real-world applicability. This work systematically analyzes the noise characteristics of real-world lightweight dToF sensors and proposes a practical and novel depth completion framework, DEPTHOR++, which enhances robustness to noisy dToF inputs from three key aspects. First, we introduce a simulation method based on synthetic datasets to generate realistic training samples for robust model training. Second, we propose a learnable-parameter-free anomaly detection mechanism to identify and remove erroneous dToF measurements, preventing misleading propagation during completion. Third, we design a depth completion network tailored to noisy dToF inputs, which integrates RGB images and pre-trained monocular depth estimation priors to improve depth recovery in challenging regions. On the ZJU-L5 dataset and real-world samples, our training strategy significantly boosts existing depth completion models, with our model achieving state-of-the-art performance, improving RMSE and Rel by 22% and 11% on average. On the Mirror3D-NYU dataset, by incorporating the anomaly detection method, our model improves upon the previous SOTA by 37% in mirror regions. On the Hammer dataset, using simulated low-cost dToF data from RealSense L515, our method surpasses the L515 measurements with an average gain of 22%, demonstrating its potential to enable low-cost sensors to outperform higher-end devices. Qualitative results across diverse real-world datasets further validate the effectiveness and generalizability of our approach.
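The abstract reports results as RMSE and Rel and as percentage improvements over prior methods. The sketch below shows how these quantities are commonly computed, assuming Rel means mean absolute relative error, that pixels without ground truth are marked by a non-positive value, and that a reported "improvement" is the fractional reduction in error against a baseline. The function names `depth_metrics` and `relative_gain` are hypothetical, and the paper's exact evaluation protocol (masking, depth range, units) may differ.

```python
import math


def depth_metrics(pred, gt):
    """Return (rmse, rel) over pixels with valid ground truth.

    pred, gt: flat sequences of depths in the same unit (e.g. metres).
    Assumption: gt <= 0 marks a pixel with no ground truth.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)
    # Root mean squared error.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Mean absolute relative error.
    rel = sum(abs(p - g) / g for p, g in pairs) / n
    return rmse, rel


def relative_gain(baseline_error, our_error):
    """Fractional error reduction against a baseline (lower error is better)."""
    return (baseline_error - our_error) / baseline_error


if __name__ == "__main__":
    pred = [1.0, 2.2, 3.0, 5.0]
    gt = [1.0, 2.0, 0.0, 4.0]  # third pixel has no ground truth
    rmse, rel = depth_metrics(pred, gt)
    print(f"RMSE={rmse:.4f}  Rel={rel:.4f}")
    print(f"gain={relative_gain(0.50, 0.39):.0%}")
```

Under this reading, a baseline RMSE of 0.50 and a new RMSE of 0.39 would correspond to a 22% gain; the numbers in the usage example are made up for illustration and are not taken from the paper.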

Paper Structure

This paper contains 20 sections, 12 equations, 16 figures, 13 tables.

Figures (16)

  • Figure 1: Effect of our training strategy and our depth completion model (without anomaly detection). From left to right: the RGB-dToF input, predictions of a lightweight PENet, the same PENet with our training strategy, and our model with our training strategy. Our training strategy improves the performance of existing methods on real-world data. Our model further enhances predictions in challenging regions.
  • Figure 2: Left: Overview of the anomaly detection method. Right: Results of combining the detection method with our depth completion model; the input dToF points are sampled from ground truth collected by high-precision sensors.
  • Figure 3: Imaging principle of a direct Time-of-Flight (dToF) sensor.
  • Figure 4: Ideal and anomalous real-world RGB-dToF samples we collected.
  • Figure 5: RGB and depth GT of existing real-world datasets. The red arrows indicate unreliable measurements in challenging regions.
  • ...and 11 more figures
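
The direct Time-of-Flight principle named in the Figure 3 caption rests on one relation: an emitted light pulse travels to the scene and back, so depth is half the round-trip time multiplied by the speed of light. The sketch below illustrates that relation, plus a simple peak-picking readout from a histogram of photon arrival times. The histogram readout, the bin layout, and the function names are assumptions made for illustration; they are not taken from the paper and do not describe any specific sensor's processing.

```python
C = 299_792_458.0  # speed of light in m/s


def depth_from_round_trip(t_seconds):
    """Depth in metres from a round-trip time: d = c * t / 2."""
    return C * t_seconds / 2.0


def depth_from_histogram(counts, bin_width_s):
    """Depth from the peak bin of a photon arrival-time histogram.

    Assumption: bin i covers [i * w, (i + 1) * w) seconds and the
    bin centre is used as the round-trip time estimate.
    """
    if not counts:
        raise ValueError("empty histogram")
    peak = max(range(len(counts)), key=counts.__getitem__)
    return depth_from_round_trip((peak + 0.5) * bin_width_s)


if __name__ == "__main__":
    # A 20 ns round trip corresponds to roughly 3 m.
    print(f"{depth_from_round_trip(20e-9):.3f} m")
    # Peak in bin 2 with 1 ns bins: round trip of about 2.5 ns.
    print(f"{depth_from_histogram([1, 2, 9, 3], 1e-9):.3f} m")
```

Peak picking is only the simplest possible readout. It returns a single depth per measurement zone, which is one reason such measurements can be misleading when a zone covers surfaces at several depths.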