Fully $1\times1$ Convolutional Network for Lightweight Image Super-Resolution

Gang Wu, Junjun Jiang, Kui Jiang, Xianming Liu

TL;DR

The paper addresses efficient single image super-resolution (SISR) by designing a fully $1\times1$ convolutional network (SCNet) that uses a parameter-free spatial-shift operation to provide local aggregation. It introduces the Shift-Conv (SC) layer and the SC-ResBlock, stacks them into SCNet variants (T, B, L), and reports competitive performance with fewer parameters on Manga109 and other benchmarks, including against models that use $3\times3$ convolutions. The work demonstrates that local feature aggregation can be achieved without $3\times3$ convolutions, enabling lightweight models with real-time potential. Ablation and scalability analyses examine the impact of shift patterns, model capacity, and upscaling choices, positioning SCNet as a versatile baseline for ultra-efficient SISR.

Abstract

Deep models have achieved significant progress on single image super-resolution (SISR) tasks, in particular large models with large kernels ($3\times3$ or more). However, the heavy computational footprint of such models prevents their deployment in real-time, resource-constrained environments. Conversely, $1\times1$ convolutions bring substantial computational efficiency, but struggle to aggregate local spatial representations, an essential capability for SISR models. In response to this dichotomy, we propose to harmonize the merits of both $3\times3$ and $1\times1$ kernels and exploit their potential for lightweight SISR tasks. Specifically, we propose a simple yet effective fully $1\times1$ convolutional network, named Shift-Conv-based Network (SCNet). By incorporating a parameter-free spatial-shift operation, it equips the fully $1\times1$ convolutional network with powerful representation capability while maintaining impressive computational efficiency. Extensive experiments demonstrate that SCNets, despite their fully $1\times1$ convolutional structure, consistently match or even surpass the performance of existing lightweight SR models that employ regular convolutions. The code and pre-trained models can be found at https://github.com/Aitical/SCNet.
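
The efficiency gap between $3\times3$ and $1\times1$ kernels mentioned above can be made concrete with a small calculation. The sketch below counts weights and multiply-accumulates for a single convolution layer; the helper names and the channel width of 64 are illustrative assumptions, not values taken from the paper.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Number of learnable parameters in a k x k convolution layer."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def conv_macs(c_in, c_out, k, height, width):
    """Multiply-accumulates for a stride-1, same-padded k x k convolution."""
    return c_in * c_out * k * k * height * width


# Hypothetical layer width, chosen only for illustration.
channels = 64

params_3x3 = conv_params(channels, channels, 3, bias=False)
params_1x1 = conv_params(channels, channels, 1, bias=False)

# For equal channel counts the ratio depends only on the kernel area: 3 * 3 = 9.
ratio = params_3x3 / params_1x1
```

Under these assumptions a $1\times1$ layer carries one ninth of the weights (and, at the same resolution, one ninth of the multiply-accumulates) of a $3\times3$ layer, which is the budget the parameter-free shift operation is meant to leave untouched.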

Paper Structure

This paper contains 12 sections, 16 figures, 8 tables, 1 algorithm.

Figures (16)

  • Figure 1: PSNR vs. Parameters. Comparisons with most recent efficient SISR models on Manga109 ($\times4$) test dataset.
  • Figure 2: The architecture of the proposed SCNet, which simply stacks multiple basic residual blocks.
  • Figure 3: Illustration of the spatial-shift operation, covering eight local regions. By rearranging the spatial positions of feature maps, spatial-shift operation enhances local spatial feature aggregation across channel groups without additional computational costs.
  • Figure 4: Side-by-side comparison of the basic ResBlock and our proposed SC-ResBlock. The proposed SC-ResBlock substantially reduces the complexity with fully $1\times1$ convolutions, while effectively aggregating local features by spatial-shift operation.
  • Figure 5: Visual comparisons on images with fine details on Urban100 test dataset (Zoom in for more details).
  • ...and 11 more figures
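
The spatial-shift operation described in the captions of Figures 3 and 4 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes the channels are split into eight roughly equal groups, that each group moves by one pixel toward a different neighbour, and that vacated border pixels are zero-filled. The function names (`spatial_shift`, `pointwise_conv`, `shift_conv`) are hypothetical, and the paper's actual grouping, step size, and border handling may differ.

```python
import numpy as np

# Eight one-pixel offsets (dy, dx): the 3x3 neighbourhood without its centre.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def spatial_shift(x):
    """Shift eight channel groups of x (C, H, W) by one pixel each.

    Parameter-free: only memory is rearranged. Borders are zero-filled
    (an assumption made for this sketch).
    """
    c, h, w = x.shape
    out = np.zeros_like(x)
    groups = np.array_split(np.arange(c), len(OFFSETS))
    for idx, (dy, dx) in zip(groups, OFFSETS):
        # Destination and source windows for a shift by (dy, dx).
        dst_y = slice(max(dy, 0), h + min(dy, 0))
        src_y = slice(max(-dy, 0), h + min(-dy, 0))
        dst_x = slice(max(dx, 0), w + min(dx, 0))
        src_x = slice(max(-dx, 0), w + min(-dx, 0))
        out[idx, dst_y, dst_x] = x[idx, src_y, src_x]
    return out


def pointwise_conv(x, weight):
    """1x1 convolution: mix channels at every pixel. weight has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def shift_conv(x, weight):
    """Shift first, then mix channels, so each output pixel sees its neighbours."""
    return pointwise_conv(spatial_shift(x), weight)


# Usage: an impulse at the centre of every channel spreads to the eight
# surrounding positions once the shifted groups are summed by a 1x1 kernel.
x = np.zeros((8, 5, 5))
x[:, 2, 2] = 1.0
y = shift_conv(x, np.ones((1, 8)))
```

Because the shift only moves data, all learnable parameters in this sketch sit in the $1\times1$ weight matrix, while each output pixel still draws on information from a $3\times3$ neighbourhood.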