Learning Hierarchical Color Guidance for Depth Map Super-Resolution
Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, Wangmeng Zuo, Yao Zhao, Sam Kwong
TL;DR
This work tackles depth map super-resolution (DSR) by rethinking how color guidance is used, and introduces a hierarchical color guidance network (HCGNet). It decouples color information into low-level detail guidance, applied through a residual mask, and high-level abstract guidance, applied through a semantic mask. The two are placed at the front and back ends of an attention-based feature projection (AFP) module, which combines multi-scale content enhancement (MCE) with adaptive attention projection (AAP). The low-level detail embedding (LDE) and high-level abstract guidance (HAG) modules enable progressive, semantically consistent depth reconstruction, achieving state-of-the-art results on four benchmark datasets, with particularly strong performance at large upsampling factors. The approach is efficient, portable to other color-guided DSR models, and offers practical benefits for depth-aware applications in robotics and perception.
Abstract
Color information is the most commonly used prior for depth map super-resolution (DSR), as it can provide high-frequency boundary guidance for detail restoration. However, its role and functionality in DSR have not been fully explored. In this paper, we rethink the utilization of color information and propose a hierarchical color guidance network for DSR. On the one hand, the low-level detail embedding module supplements depth features with high-frequency color information through a residual mask at the low-level stages. On the other hand, the high-level abstract guidance module maintains semantic consistency during reconstruction by using a semantic mask that encodes global guidance information. These two levels of color information act at the front and back ends of the attention-based feature projection (AFP) module, providing guidance in a more comprehensive form. Meanwhile, the AFP module integrates a multi-scale content enhancement block and an adaptive attention projection block to make full use of multi-scale information and to adaptively project critical restoration information through attention. Compared with state-of-the-art methods on four benchmark datasets, our method performs more competitively both qualitatively and quantitatively.
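To make the idea of residual-mask guidance more concrete, the following toy sketch shows one plausible way a mask derived from the color-depth residual could gate how much color detail is added to a depth feature map. It is an illustration only, not the paper's implementation: the function name `residual_mask_guidance`, the single-channel list-of-lists feature maps, and the specific formula (mask = sigmoid(color - depth), output = depth + mask * color) are all assumptions made here, since the abstract does not specify how the mask is computed.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def residual_mask_guidance(depth_feat, color_feat):
    """Hypothetical residual-mask fusion on single-channel 2D feature maps.

    Assumed formulation (not taken from the paper):
        mask = sigmoid(color - depth)   # where color carries extra response
        out  = depth + mask * color     # add gated color detail to depth
    Both inputs are lists of rows with identical shapes.
    """
    if len(depth_feat) != len(color_feat):
        raise ValueError("feature maps must have the same number of rows")

    fused = []
    for d_row, c_row in zip(depth_feat, color_feat):
        if len(d_row) != len(c_row):
            raise ValueError("feature maps must have the same row length")
        # Per-pixel gating: the mask decides how much color detail is injected.
        fused.append([d + sigmoid(c - d) * c for d, c in zip(d_row, c_row)])
    return fused


if __name__ == "__main__":
    depth = [[0.0, 1.0], [2.0, 0.5]]
    color = [[0.0, 1.0], [0.0, 3.0]]
    for row in residual_mask_guidance(depth, color):
        print(row)
```

In this sketch, a pixel whose color feature is zero leaves the depth feature unchanged, while a pixel where color responds more strongly than depth receives a larger share of color detail. A learned network would replace the fixed sigmoid of the residual with convolutional layers, which is beyond what the abstract describes.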
