Completion as Enhancement: A Degradation-Aware Selective Image Guided Network for Depth Completion

Zhiqiang Yan, Zhengxue Wang, Kun Wang, Jun Li, Jian Yang

TL;DR

This work reframes depth completion as depth enhancement by first densifying sparse depth with non-CNN methods to obtain a coarse depth map and then learning an implicit degradation that links this coarse depth to the target dense depth. A Degradation-Aware Decomposition and Fusion (DADF) module decomposes the degradation in the frequency domain to selectively incorporate high-frequency RGB information, while a Conditional Mamba enables global RGB-D interaction aligned with degradation cues. The model is trained with a reconstruction loss plus a self-supervised degradation loss, encouraging accurate depth recovery and meaningful degradation representations. Across NYUv2, DIML, SUN RGB-D, and TOFDC, SigNet achieves state-of-the-art results with strong generalization and significantly reduced model size and inference time, indicating practical impact for robust RGB-D sensing and scene understanding. Limitations include reduced performance on extremely sparse outdoor KITTI data, suggesting potential gains from auxiliary dense-depth supervision or edge-aware priors.
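The idea of selectively passing high-frequency RGB information can be illustrated with a small sketch. It is not the DADF module: the paper decomposes a learned degradation in the frequency domain, whereas this example swaps in a spatial box-blur residual as the high-pass step and a hand-set `gate` mask as a stand-in for the learned degradation cue. The names `box_blur`, `select_high_freq`, and `gate` are hypothetical.

```python
def box_blur(img, radius=1):
    """Low-pass filter: mean over a (2r+1)x(2r+1) window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            vals = [img[j][i] for j in ys for i in xs]
            out[y][x] = sum(vals) / len(vals)
    return out


def select_high_freq(gray, gate, radius=1):
    """Gate the high-frequency residual (image minus its low-pass version).

    `gate` is an assumed per-pixel weight in [0, 1]; in a learned model it
    would be predicted from degradation cues rather than set by hand.
    """
    low = box_blur(gray, radius)
    h, w = len(gray), len(gray[0])
    return [
        [gate[y][x] * (gray[y][x] - low[y][x]) for x in range(w)]
        for y in range(h)
    ]


# A vertical step edge between columns 1 and 2; the gate keeps only the left half.
gray = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
gate = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
high = select_high_freq(gray, gate)
```

Flat regions produce a zero residual, so only pixels near the edge carry signal, and the gate decides which of those are passed on to the depth branch.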

Abstract

In this paper, we introduce the Selective Image Guided Network (SigNet), a novel degradation-aware framework that transforms depth completion into depth enhancement for the first time. Moving beyond direct completion using convolutional neural networks (CNNs), SigNet first densifies sparse depth data through non-CNN densification tools to obtain coarse yet dense depth. This approach eliminates the mismatch and ambiguity caused by direct convolution over irregularly sampled sparse data. Subsequently, SigNet redefines completion as enhancement, establishing a self-supervised degradation bridge between the coarse depth and the targeted dense depth for effective RGB-D fusion. To achieve this, SigNet leverages the implicit degradation to adaptively select high-frequency components (e.g., edges) of RGB data to compensate for the coarse depth. This degradation is further integrated into a multi-modal conditional Mamba, dynamically generating the state parameters to enable efficient global high-frequency information interaction. We conduct extensive experiments on the NYUv2, DIML, SUN RGB-D, and TOFDC datasets, demonstrating the state-of-the-art (SOTA) performance of SigNet.
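As a rough illustration of what a non-CNN densification step might look like, the sketch below fills every empty pixel with the value of its nearest valid measurement using a multi-source breadth-first search. This is an assumed stand-in, not the tool the paper uses; the function name `densify_nearest` and the convention that `0.0` marks a missing depth are hypothetical.

```python
from collections import deque


def densify_nearest(sparse, invalid=0.0):
    """Fill invalid pixels with the nearest valid depth value.

    Multi-source BFS over 4-connected neighbours, so "nearest" means
    Manhattan distance; ties are resolved by scan order of the seeds.
    """
    h, w = len(sparse), len(sparse[0])
    dense = [row[:] for row in sparse]
    seen = [[sparse[y][x] != invalid for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if seen[y][x])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx]:
                dense[ny][nx] = dense[y][x]  # propagate the seed's depth
                seen[ny][nx] = True
                queue.append((ny, nx))
    return dense


# Two valid measurements in a 3x5 map; every other pixel is missing.
sparse = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 3.0],
]
coarse = densify_nearest(sparse)
```

The result is dense but blocky, with no notion of object boundaries, which is the kind of coarse map the enhancement stage is then expected to refine with RGB guidance.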

Paper Structure (20 sections, 10 equations, 10 figures, 5 tables, 1 algorithm)

Figures (10)

  • Figure 1: Illustration of our main concept. It redefines depth completion as depth enhancement. We first transform sparse depth input via non-CNN densification tools (ku2018ipbasic; levin2004colorization), yielding coarse but dense depth. Then we predict the precise and dense depth from the coarse depth by leveraging degradation assumption (zhong2023guided; wang2021unsupervised; zhang2021designing).
  • Figure 2: Pipeline of SigNet. The sparse depth data is initially filled to create a coarse depth map. We then utilize the degradation assumption in Eq. \ref{eq_degradation} to establish a connection between this coarse map and the target prediction. The color image features and degradation representations are employed to guide the multi-modal fusion through our proposed DADF module, as depicted in Fig. 3.
  • Figure 3: Overview of our DADF. $\odot$ and CM refer to element-wise multiplication and the conditional Mamba, respectively.
  • Figure 4: Visual comparison with SOTA methods on the TOFDC dataset, including NLSPN (park2020nonlocal), RigNet (yan2022rignet), and our SigNet.
  • Figure 5: Visual comparison with SOTA approaches on the DIML dataset, including CSPN (2018Learning), AGG-Net (chen2023agg), and our SigNet.
  • ...and 5 more figures