Learning Hierarchical Color Guidance for Depth Map Super-Resolution

Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, Wangmeng Zuo, Yao Zhao, Sam Kwong

TL;DR

This work tackles depth map super-resolution (DSR) by rethinking how color guidance is used and proposes a hierarchical color guidance network (HCGNet). Color information is split into low-level detail guidance, applied through a residual mask, and high-level abstract guidance, applied through a semantic mask. Both act around an attention-based feature projection (AFP) module, which is built from a multi-scale content enhancement (MCE) block and an adaptive attention projection (AAP) block. The low-level detail embedding (LDE) and high-level abstract guidance (HAG) modules support progressive, semantically consistent depth reconstruction. The method reports state-of-the-art results on four benchmark datasets, with particularly strong performance at large upsampling factors. The approach is described as efficient and portable to other color-guided DSR models, with practical value for depth-aware applications in robotics and perception.

Abstract

Color information is the most commonly used prior for depth map super-resolution (DSR), because it can provide high-frequency boundary guidance for detail restoration. However, its role in DSR has not been fully exploited. In this paper, we rethink the utilization of color information and propose a hierarchical color guidance network for DSR. On the one hand, the low-level detail embedding module supplements depth features with high-frequency color information through a residual mask at the low-level stages. On the other hand, the high-level abstract guidance module maintains semantic consistency during reconstruction by using a semantic mask that encodes global guidance information. These two levels of color information act at the front and back ends of the attention-based feature projection (AFP) module, giving color a more comprehensive role. In addition, the AFP module integrates the multi-scale content enhancement block and the adaptive attention projection block to exploit multi-scale information and adaptively project critical restoration information via attention. Compared with state-of-the-art methods on four benchmark datasets, our method achieves more competitive performance both qualitatively and quantitatively.
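The abstract describes two kinds of mask-based color guidance placed around the AFP module. The following is a minimal NumPy sketch of that idea only, not the authors' implementation: the box blur, the sigmoid gates, the identity placeholder for AFP, and all function names are assumptions chosen for illustration, and the real modules are learned networks operating on multi-channel features.

```python
import numpy as np

def box_blur(x, k=3):
    """Mean filter with edge padding (assumed stand-in for a learned low-pass branch)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def low_level_detail_embedding(depth_feat, color_feat):
    """Hypothetical LDE: add color high-frequency detail, gated by a residual mask."""
    color_hf = color_feat - box_blur(color_feat)   # high-frequency part of color
    depth_hf = depth_feat - box_blur(depth_feat)   # detail already present in depth
    # Gate is larger where color carries detail that the depth feature lacks.
    mask = sigmoid(np.abs(color_hf) - np.abs(depth_hf))
    return depth_feat + mask * color_hf

def high_level_abstract_guidance(depth_feat, color_sem):
    """Hypothetical HAG: reweight depth features with a globally normalized mask."""
    sem_mask = sigmoid(color_sem - color_sem.mean())
    return depth_feat + depth_feat * sem_mask      # residual reweighting

def hierarchical_guidance(depth_feat, color_low, color_high, afp=lambda x: x):
    """Assumed ordering: LDE before AFP, HAG after it; AFP is an identity placeholder."""
    x = low_level_detail_embedding(depth_feat, color_low)
    x = afp(x)
    return high_level_abstract_guidance(x, color_high)

# Toy data: a sharp color edge and a blurred copy standing in for upsampled depth.
color = np.tile(np.r_[np.zeros(4), np.ones(4)], (8, 1))
depth = box_blur(color)
refined = low_level_detail_embedding(depth, color)
```

On this toy edge, the gated residual moves the two blurred columns back toward the sharp step, while flat regions are left unchanged because the color high-frequency term is zero there.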

Paper Structure (17 sections, 16 equations, 12 figures, 8 tables)

Figures (12)

  • Figure 1: Illustration of the color guidance in DSR. Mode (a) only utilizes the low-level color information to guide the reconstruction of detail information; Mode (b) treats different levels of color information indiscriminately; Mode (c) represents our guidance model, which divides color information into two parts, i.e., low-level and high-level information, and allows them to play different roles.
  • Figure 2: The architecture of HCGNet. The LR depth map and HR color image are first fed into the feature extraction unit to extract multi-level features. Then, the Attention-based Feature Projection (AFP) module, Low-level Detail Embedding (LDE) module, and High-level Abstract Guidance (HAG) module work together to gradually recover details in the LR depth features and generate the HR depth map. Color information is used in two ways. On the one hand, the low-level color features are used in the low-level reconstruction stage to restore details through the LDE module. On the other hand, the high-level abstract features are used at the end of the AFP module to provide semantic guidance through the HAG module.
  • Figure 3: The overall architecture of the AFP module and details of its sub-blocks, i.e., the MCE block and the AAP block.
  • Figure 4: The architecture of the Low-level Detail Embedding (LDE) module.
  • Figure 5: The architecture of the High-level Abstract Guidance (HAG) module.
  • ...and 7 more figures
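
The captions for Figures 1 and 2 describe an LR depth map and an HR color image entering a feature extraction unit, with color features then split into low-level and high-level parts. The sketch below illustrates only the shapes involved, under loud assumptions: an average-pooling pyramid stands in for the learned feature extraction unit, nearest-neighbor repetition stands in for any upsampling, and the names `color_pyramid`, `split_guidance`, and `upsample_nearest` are hypothetical.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling on an (H, W) array with even H and W."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def color_pyramid(gray, levels=3):
    """Assumed stand-in for multi-level color features: fine to coarse."""
    feats = [gray.astype(float)]
    for _ in range(levels - 1):
        feats.append(avg_pool2(feats[-1]))
    return feats

def split_guidance(feats):
    """Mode (c) in Figure 1: finest level as detail cue, coarsest as abstract cue."""
    return feats[0], feats[-1]

def upsample_nearest(lr_depth, scale):
    """Repeat each LR depth pixel so it aligns with the HR color grid."""
    return np.repeat(np.repeat(lr_depth, scale, axis=0), scale, axis=1)

# Toy inputs: an 8x8 HR color image (as grayscale) and a 2x2 LR depth map, scale 4.
rng = np.random.default_rng(0)
hr_gray = rng.random((8, 8))
lr_depth = rng.random((2, 2))

low_level, high_level = split_guidance(color_pyramid(hr_gray, levels=3))
depth_on_hr_grid = upsample_nearest(lr_depth, scale=4)
```

Here the low-level cue keeps the full 8x8 resolution, where boundaries live, while the high-level cue is a coarse 2x2 summary. A learned extractor would produce richer features, but the split into a fine and a coarse branch follows the same pattern.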