Efficient Image Super-Resolution with Feature Interaction Weighted Hybrid Network

Wenjie Li, Juncheng Li, Guangwei Gao, Weihong Deng, Jian Yang, Guo-Jun Qi, Chia-Wen Lin

TL;DR

This work proposes a Feature Interaction Weighted Hybrid Network (FIWHN) that uses a series of Wide-residual Distillation Interaction Blocks (WDIBs) as its backbone and incorporates a Transformer, exploring a novel architecture for combining CNN and Transformer.

Abstract

Lightweight image super-resolution aims to reconstruct high-resolution images from low-resolution images at low computational cost. However, existing methods lose middle-layer features because of activation functions. To minimize the impact of this intermediate feature loss on reconstruction quality, we propose a Feature Interaction Weighted Hybrid Network (FIWHN), which uses a series of Wide-residual Distillation Interaction Blocks (WDIBs) as its backbone. Every three WDIBs form a Feature Shuffle Weighted Group (FSWG) through mutual information shuffle and fusion. Moreover, to mitigate the negative effects of intermediate feature loss, we introduce Wide Residual Weighting units within each WDIB. These units fuse features of varying levels of detail through a Wide-residual Distillation Connection (WRDC) and a Self-Calibrating Fusion (SCF). To compensate for global feature deficiencies, we incorporate a Transformer and explore a novel architecture for combining CNN and Transformer. Extensive experiments on low-level and high-level tasks show that FIWHN achieves a favorable balance between performance and efficiency. Code will be available at https://github.com/IVIPLab/FIWHN.

Paper Structure (16 sections, 28 equations, 12 figures, 7 tables)


Figures (12)

  • Figure 1: Comparison of different interaction schemes between the CNN and Transformer. "CT in series" (Figure \ref{combine} (a)) denotes the series connection of CNN with Transformer; "TC in series" (Figure \ref{combine} (b)) denotes the series connection of Transformer with CNN; "Parallel" (Figure \ref{combine} (c)) denotes the parallel connection between CNN and Transformer; "Ours" (Figure \ref{combine} (d)) denotes the potential interaction between CNN and Transformer.
  • Figure 2: Architecture of our proposed Feature Interaction Weighted Hybrid Network (FIWHN).
  • Figure 3: (a) Structure of the Wide-residual Distillation Interaction Block (WDIB). $M_i$ and $M_{i-1}$ denote combination coefficient learning (detailed in Figure 4), $\odot$ denotes multiplication, ⓢ denotes the sigmoid function, and $\Theta(x_i, y_i) = x_i + y_i M_i(y_i)$; (b) Structure of the Efficient Transformer (ET).
  • Figure 4: Details of the combination coefficient learning, which corresponds to $M_i$ and $M_{i-1}$ in Figure 3 (a).
  • Figure 5: Exploration of how to combine CNN and Transformer efficiently, and of the potential of our method under multiple combinations of the two.
  • ...and 7 more figures
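
The weighting rule in the Figure 3 caption, $\Theta(x_i, y_i) = x_i + y_i M_i(y_i)$, can be illustrated with a small numeric sketch. This is not the authors' implementation: the caption does not specify how $M_i$ is computed, so the function `combination_coefficient` below is a hypothetical stand-in that squashes the mean of $y_i$ through a sigmoid, and the `weight` and `bias` parameters are likewise assumptions made only for illustration.

```python
import math


def sigmoid(v: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def combination_coefficient(y, weight=1.0, bias=0.0):
    """Hypothetical stand-in for M_i (an assumption, not from the paper):
    a scalar gate computed from the mean of y."""
    mean_y = sum(y) / len(y)
    return sigmoid(weight * mean_y + bias)


def theta(x, y, weight=1.0, bias=0.0):
    """Theta(x, y) = x + y * M(y), applied element-wise over two feature lists."""
    m = combination_coefficient(y, weight, bias)
    return [xi + yi * m for xi, yi in zip(x, y)]


# Toy feature vectors standing in for x_i and y_i.
x = [1.0, 2.0, 3.0]
y = [0.5, -0.5, 0.0]
out = theta(x, y)
print(out)
```

In this toy case the mean of `y` is zero, so the gate evaluates to 0.5 and each element of `y` contributes half its value to the residual path `x`. In the actual network the coefficient would be learned, as Figure 4 describes.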