SGNet: Structure Guided Network via Gradient-Frequency Awareness for Depth Map Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jian Yang

TL;DR

SGNet addresses depth map super-resolution by transferring high-frequency cues from RGB through two dedicated modules: Gradient Calibration Module (GCM) and Frequency Awareness Module (FAM). It introduces gradient- and frequency-domain losses to enforce structure fidelity in both gradient and spectral spaces. Empirical results across NYU-v2, Middlebury, Lu, and RGB-D-D show state-of-the-art performance and strong generalization, with ablation studies confirming the contributions of GCM, FAM, and the chosen loss components. The approach yields sharper edges and more accurate depth details, highlighting the practical impact of incorporating gradient and frequency information into RGB-guided DSR.
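As a rough illustration of what gradient- and frequency-domain losses can look like, the sketch below compares a prediction with a target in both spaces. The L1 distances, the forward-difference gradient, and the function names `gradient_map`, `gradient_loss`, and `frequency_loss` are assumptions made for this example; the exact formulation used by SGNet is not given in this summary.

```python
import numpy as np

def gradient_map(depth):
    """Forward differences along x and y; the last column/row is left at zero."""
    gx = np.zeros_like(depth)
    gy = np.zeros_like(depth)
    gx[:, :-1] = depth[:, 1:] - depth[:, :-1]
    gy[:-1, :] = depth[1:, :] - depth[:-1, :]
    return gx, gy

def gradient_loss(pred, target):
    """Assumed form: mean absolute difference between gradient maps."""
    pgx, pgy = gradient_map(pred)
    tgx, tgy = gradient_map(target)
    return float(np.mean(np.abs(pgx - tgx)) + np.mean(np.abs(pgy - tgy)))

def frequency_loss(pred, target):
    """Assumed form: mean magnitude of the difference between 2D spectra."""
    return float(np.mean(np.abs(np.fft.fft2(pred) - np.fft.fft2(target))))

# Toy data: a sharp vertical depth edge and a prediction that smears it.
target = np.zeros((8, 8))
target[:, 4:] = 1.0
pred = target.copy()
pred[:, 3] = 0.5  # the edge is spread over two pixels

print("gradient loss:", gradient_loss(pred, target))
print("frequency loss:", frequency_loss(pred, target))
```

Both terms are zero for a perfect prediction and grow when an edge is smeared, which is the behaviour a structure-fidelity penalty is meant to have.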

Abstract

Depth super-resolution (DSR) aims to restore high-resolution (HR) depth from a low-resolution (LR) input, where an RGB image is often used to guide the task. Recent image-guided DSR approaches mainly focus on the spatial domain to rebuild depth structure. However, since the structure of LR depth is usually blurry, considering only the spatial domain is insufficient to obtain satisfactory results. In this paper, we propose the structure guided network (SGNet), a method that pays more attention to the gradient and frequency domains, both of which have the inherent ability to capture high-frequency structure. Specifically, we first introduce the Gradient Calibration Module (GCM), which employs the accurate gradient prior of RGB to sharpen the LR depth structure. Then we present the Frequency Awareness Module (FAM), which recursively applies multiple spectrum differencing blocks (SDB), each of which propagates the precise high-frequency components of RGB into the LR depth. Extensive experimental results on both real and synthetic datasets demonstrate the superiority of our SGNet, which reaches state-of-the-art performance. Code and pre-trained models are available at https://github.com/yanzq95/SGNet.
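The claim that blurry LR depth loses structure that the gradient and frequency domains can expose can be checked on a toy example. The sketch below is not SGNet code: the `blur_x` kernel is only a stand-in for LR degradation, and `high_freq_ratio` and its `cutoff` parameter are hypothetical helpers introduced for illustration.

```python
import numpy as np

def blur_x(img):
    """Circular [0.25, 0.5, 0.25] blur along x; a stand-in for LR blurriness."""
    return 0.25 * np.roll(img, 1, axis=1) + 0.5 * img + 0.25 * np.roll(img, -1, axis=1)

def high_freq_ratio(img, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` cycles/pixel (radial frequency)."""
    h, w = img.shape
    energy = np.abs(np.fft.fft2(img)) ** 2
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return float(energy[radius > cutoff].sum() / energy.sum())

def peak_gradient(img):
    """Largest absolute horizontal forward difference."""
    return float(np.abs(np.diff(img, axis=1)).max())

# Sharp "HR" depth with one vertical edge, and a blurred version of it.
hr = np.zeros((32, 32))
hr[:, 16:] = 1.0
blurred = blur_x(hr)

print("peak gradient  :", peak_gradient(hr), "->", peak_gradient(blurred))
print("high-freq ratio:", high_freq_ratio(hr), "->", high_freq_ratio(blurred))
```

In this toy setting the blurred map has both a weaker peak gradient and a smaller share of high-frequency energy, which is the gap that RGB gradient and spectrum cues are meant to fill.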

Paper Structure

This paper contains 24 sections, 15 equations, 12 figures, and 5 tables.

Figures (12)

  • Figure 1: RMSE comparison between our SGNet and existing state-of-the-art methods on four benchmarks ($\times 16$).
  • Figure 2: Visualizations of (c)-(f) Gradient features and (h)-(l) Spectrum features, where $\ominus$ refers to subtraction.
  • Figure 3: Overview of our Structure Guided Network (SGNet). Given $I_{rgb}$ and $D_{lr}^{up}$ as input, the Gradient Calibration Module (GCM) first maps them into gradient domain, producing $F_{ge}$ with sharp depth structure. Then, $I_{rgb}$, $D_{lr}$ and $F_{ge}$ are fed into the Frequency Awareness Module (FAM) to estimate frequency enhanced depth feature $D_{fe}$ via recursive Spectrum Differencing Blocks (SDB). $\uparrow$: bicubic up-sample. Grad. Mapping: Gradient Mapping. Freq. Mapping: Frequency Mapping.
  • Figure 4: Visualization of (a)-(b) gradient features and (c)-(d) depth features on Middlebury dataset.
  • Figure 5: Spectrum differencing block (SDB). Green dashed box: $1\times 1$ convolution. Gray rectangular box: a $1\times 1$ convolution and an invertible neural network [zhou2022pan].
  • ...and 7 more figures