Semantic-Guided Global-Local Collaborative Networks for Lightweight Image Super-Resolution

Wanshu Fan, Yue Wang, Cong Wang, Yunzhe Zhang, Wei Wang, Dongsheng Zhou

TL;DR

This work tackles the challenge of high-quality, efficient image super-resolution for measurement systems by proposing SGGLC-Net, a lightweight SR framework that injects semantic priors from a pre-trained model via a Semantic Guidance Module. It introduces a Global-Local Collaborative Module composed of three GLDEBs and a Hybrid Attention Block to fuse local details with global context, enabling accurate texture and edge restoration at reduced complexity. Extensive experiments on DIV2K-derived training, synthetic benchmarks, and RealSR demonstrate competitive PSNR/SSIM with fewer parameters and lower multi-adds than state-of-the-art lightweight SR methods, underscoring potential gains in precision for vision-based measurement and detection tasks. The approach offers practical impact for measurement pipelines by improving image quality without prohibitive computational costs, with code availability for broader adoption and extension.

Abstract

Single-Image Super-Resolution (SISR) plays a pivotal role in enhancing the accuracy and reliability of measurement systems, which are integral to various vision-based instrumentation and measurement applications. These systems often require clear and detailed images for precise object detection and recognition. However, images captured by visual measurement tools frequently suffer from degradation, including blurring and loss of detail, which can impede measurement accuracy. As a potential remedy, we propose a Semantic-Guided Global-Local Collaborative Network (SGGLC-Net) for lightweight SISR. SGGLC-Net leverages semantic priors extracted from a pre-trained model to guide the super-resolution process and improve the quality of restored image details. Specifically, we propose a Semantic Guidance Module that integrates the semantic priors into the super-resolution network, enabling the network to capture and exploit them more effectively. To further explore both local and non-local interactions for improved detail rendition, we propose a Global-Local Collaborative Module, which combines three Global and Local Detail Enhancement Modules with a Hybrid Attention Mechanism so that they work together to learn more useful features efficiently. Our extensive experiments show that SGGLC-Net achieves competitive PSNR and SSIM values across multiple benchmark datasets while reducing multi-adds by 12.81G compared to state-of-the-art lightweight super-resolution approaches. These improvements underscore the potential of our approach to enhance the precision and effectiveness of visual measurement systems. Code is available at https://github.com/fanamber831/SGGLC-Net.

Paper Structure

This paper contains 32 sections, 6 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: Model multi-adds comparison on Urban100 \cite{huang2015single} ($\times$4), where the output image size is 1280$\times$720. Our SGGLC-Net family achieves a better trade-off between model complexity and super-resolution performance.
  • Figure 2: Challenging super-resolution example. (a) and (b) are the HR image and HR patch. (c) is the result restored by bicubic interpolation. (d) and (e) are the restoration results of CARN \cite{ahn2018fast} and LBNet \cite{DBLP:conf/ijcai/GaoW0L0Z22}, which do not use semantic prior guidance. (f) and (g) are the restoration results of our SGGLC-Net and SGGLC-Net-L, which exploit semantic prior guidance. Our SGGLC-Net family recovers better details than previous state-of-the-art methods.
  • Figure 3: The overall architecture of the proposed Semantic-Guided Global-Local Collaborative Network (SGGLC-Net). Our SGGLC-Net consists of four major components: Semantic Guidance Module, Shallow Feature Extraction Module, Deep Feature Extraction Module, and Reconstruction Module. Given a low-resolution (LR) input $I_{LR}$, we first use a 3$\times$3 convolutional layer to extract shallow features. Then, we use a series of Global-Local Collaborative Modules (GLCM), as shown in Fig. 5, as the Deep Feature Extraction Module to extract deeper features. Meanwhile, we employ the Semantic Guidance Module (SGM), as shown in Fig. 4, to exploit the semantic priors obtained from the pre-trained VGG19 extractor \cite{VGG} and guide the process of deep feature extraction. Finally, we use a Reconstruction Module, which consists of a 3$\times$3 convolutional layer and a pixel shuffle layer, to up-sample the fused feature and generate the super-resolution image $I_{SR}$.
  • Figure 4: The architecture of the Semantic Guidance Module (SGM). The SGM uses the semantic priors extracted from a pre-trained VGG19 network \cite{VGG} to guide the deep feature extraction of the super-resolution backbone. Note that $F_{vi}^{'}$ is the input of the next GLCM, while $P_{i+1}$ serves as one of the inputs of the next SGM.
  • Figure 5: (a) The architecture of the proposed Global-Local Collaborative Module (GLCM). Our GLCM consists of 3 Global-Local Detail Enhancement Blocks (GLDEB), which let local and non-local information interact to better super-resolve images, and a Hybrid Attention Block (HAB), which dynamically selects more useful features. (b) The architecture of the local feature extraction. (c) The structure of the global feature extraction.
  • ...and 9 more figures
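
The Reconstruction Module in the Figure 3 caption (a 3$\times$3 convolution followed by a pixel shuffle layer) and the multi-adds setting in Figure 1 (a 1280$\times$720 output at $\times$4) can be illustrated with a small pure-Python sketch. This is not the authors' implementation. The channel layout follows the common sub-pixel convolution convention, the helper names `pixel_shuffle` and `conv_multi_adds` are hypothetical, and the feature width of 64 channels is an assumed value that the captions above do not state.

```python
def pixel_shuffle(feat, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Uses the common sub-pixel layout:
    out[c][y*r + i][x*r + j] = feat[c*r*r + i*r + j][y][x].
    """
    channels = len(feat)
    if channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels // (r * r)
    h, w = len(feat[0]), len(feat[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = feat[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


def conv_multi_adds(c_in, c_out, k, out_h, out_w):
    """Multiply-accumulate count of one k x k convolution (bias ignored)."""
    return c_in * c_out * k * k * out_h * out_w


if __name__ == "__main__":
    scale = 4
    hr_w, hr_h = 1280, 720                        # output size used in Figure 1
    lr_w, lr_h = hr_w // scale, hr_h // scale     # 320 x 180 LR grid
    width = 64                                    # ASSUMED feature width (hypothetical)
    # The 3x3 conv runs on the LR grid and emits 3 * scale^2 channels,
    # which the pixel shuffle then folds into a 3-channel HR image.
    macs = conv_multi_adds(width, 3 * scale * scale, 3, lr_h, lr_w)
    print(f"reconstruction conv multi-adds: {macs / 1e9:.2f} G")
```

Because the convolution runs at LR resolution and the pixel shuffle only rearranges values, up-sampling in this way adds no multi-adds beyond the final convolution. Designs of this kind therefore keep the reported complexity low for a fixed output size.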