Multi-Depth Branch Network for Efficient Image Super-Resolution

Huiyuan Tian, Li Zhang, Shijian Li, Min Yao, Gang Pan

TL;DR

The paper tackles efficient single-image super-resolution by introducing MDBN, an asymmetric CNN built from Multi-Depth Branch Modules (MDBMs) that separate and fuse high- and low-frequency information. Each MDBM combines a two-layer high-frequency path with a one-layer low-frequency path and merges their outputs via additive fusion followed by GELU activation. MDBMs are stacked in residual multi-depth blocks and followed by a lightweight upsampler to achieve fast inference. The authors validate the approach with extensive experiments on standard SR benchmarks, reporting state-of-the-art efficiency and competitive or superior PSNR/SSIM at $\times2$, $\times3$, and $\times4$ scales, along with qualitative results demonstrating structural coherence and texture fidelity. A novel Fourier spectral analysis framework is proposed to quantify frequency-domain differentiation between branches, revealing reduced feature redundancy and effective integration of high- and low-frequency information. The results point to MDBN as a practical, high-performance option for real-time SR on resource-constrained devices, with code available at the project repository.
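The branch structure described above can be illustrated with a small framework-free sketch. This is not the authors' implementation: the 3x3 kernel size, the absence of an activation between the two layers of the deeper branch, the tanh approximation of GELU, and all names (`conv2d_same`, `gelu`, `mdbm_forward`, the weight arguments) are assumptions made for illustration. Only the overall shape (a two-layer path and a one-layer path, added together and passed through GELU) comes from the summary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_same(x, weight):
    """Stride-1, zero-padded convolution (cross-correlation, as in deep learning).

    x: (C_in, H, W); weight: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    k = weight.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    # windows has shape (C_in, H, W, k, k): one k x k patch per output pixel.
    windows = sliding_window_view(xp, (k, k), axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", windows, weight)


def gelu(x):
    """Tanh approximation of GELU (assumed here; the exact form uses erf)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def mdbm_forward(x, w_hf1, w_hf2, w_lf):
    """Hypothetical MDBM forward pass: deeper branch + shallower branch, then GELU."""
    f_hf = conv2d_same(conv2d_same(x, w_hf1), w_hf2)  # two-layer (deeper) branch
    f_lf = conv2d_same(x, w_lf)                       # one-layer (shallower) branch
    return gelu(f_hf + f_lf)                          # additive fusion, then GELU


# Toy usage with random weights; channel count and spatial size are arbitrary.
rng = np.random.default_rng(0)
channels, height, width = 4, 8, 8
x = rng.standard_normal((channels, height, width))
w_hf1, w_hf2, w_lf = (
    0.1 * rng.standard_normal((channels, channels, 3, 3)) for _ in range(3)
)
y = mdbm_forward(x, w_hf1, w_hf2, w_lf)
print(y.shape)  # (4, 8, 8)
```

Because both branches preserve the spatial size and channel count, their outputs can be summed element-wise without any projection, which is what keeps the fusion step cheap in this sketch.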

Abstract

A longstanding challenge in Super-Resolution (SR) is how to efficiently enhance high-frequency details in Low-Resolution (LR) images while maintaining semantic coherence. This is particularly crucial in practical applications where SR models are often deployed on low-power devices. To address this issue, we propose an innovative asymmetric SR architecture featuring Multi-Depth Branch Modules (MDBMs). These MDBMs contain branches of different depths, designed to capture high- and low-frequency information simultaneously and efficiently. The hierarchical structure of the MDBM allows the deeper branch to gradually accumulate fine-grained local details under the contextual guidance of the shallower branch. We visualize this process using feature maps, and further demonstrate the rationality and effectiveness of this design using our proposed Fourier spectral analysis methods. Moreover, our model exhibits more significant spectral differentiation between branches than existing branch networks. This suggests that the MDBM reduces feature redundancy and offers a more effective method for integrating high- and low-frequency information. Extensive qualitative and quantitative evaluations on various datasets show that our model can generate structurally consistent and visually realistic High-Resolution (HR) images. It achieves state-of-the-art (SOTA) results at a very fast inference speed. Our code is available at https://github.com/thy960112/MDBN.

Paper Structure (18 sections, 7 equations, 7 figures, 3 tables)

This paper contains 18 sections, 7 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Comparisons of PSNR, SSIM and inference time between our proposed MDBN model and other efficient methods for $\times4$ SR. The size of each circle in the graph represents the SSIM value. Both PSNR and SSIM are evaluated on the Set5 benchmark [2012Set5]. Inference time is the average test time over a set of 50 LR images with dimensions of $320 \times 180$ pixels, using an NVIDIA RTX 3090 GPU.
  • Figure 2: An overview of the proposed MDBN. The MDBN first maps the LR input image into feature space using an initial convolutional layer. Next, it employs a set of Residual Multi-Depth Branch (RMDB) blocks to extract features, followed by an upsampler module for image reconstruction. Each RMDB block consists of two MDBM layers.
  • Figure 3: (a) The details of the proposed Multi-Depth Branch Module (MDBM) and its corresponding feature map visualizations on $\times2$ SR. (b) Normalized Fourier spectra of feature maps from different branches. Larger values near the center of a spectrum indicate more low-frequency content. It can be observed from $\hat{\mathbf{F}}_{\mathrm{hf}}$ that the deeper branch mainly predicts the high-frequency information and captures fine-grained object details. Meanwhile, $\hat{\mathbf{F}}_{\mathrm{lf}}$ reveals that the shallower branch contains more low-frequency semantic understanding and delineates broader object contours.
  • Figure 4: Visual comparisons for $\times2$ and $\times3$ SR. (a) HR patch of ground truth images, (b) Bicubic, (c) LapSRN [lai2017LapSRN], (d) CARN [ahn2018CARN], (e) IMDN [hui2019IMDN], (f) ShuffleMixer [sun2022ShuffleMixerNIPS], (g) SAFM [sun2023SAFM], (h) MDBN (ours).
  • Figure 5: Visual comparisons for $\times4$ SR. (a) HR patch of ground truth images, (b) Bicubic, (c) LapSRN [lai2017LapSRN], (d) CARN [ahn2018CARN], (e) IMDN [hui2019IMDN], (f) ShuffleMixer [sun2022ShuffleMixerNIPS], (g) SAFM [sun2023SAFM], (h) MDBN (ours).
  • ...and 2 more figures
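
The normalized Fourier spectra mentioned in the Figure 3 caption can be approximated with a short NumPy sketch. The paper's exact normalization is not given in this summary, so the choices below (channel-averaged amplitude, log scaling, min-max normalization) are assumptions, and `low_frequency_fraction` is a hypothetical summary statistic added only to make the "larger values in the center means more low-frequency content" reading concrete.

```python
import numpy as np


def centered_amplitude(feature_map):
    """Channel-averaged FFT amplitude of a (C, H, W) map, with DC moved to the center."""
    spectrum = np.fft.fft2(feature_map, axes=(-2, -1))
    spectrum = np.fft.fftshift(spectrum, axes=(-2, -1))
    return np.abs(spectrum).mean(axis=0)


def normalize_spectrum(amplitude):
    """Log-scale and min-max normalize to [0, 1] (an assumed normalization)."""
    log_amp = np.log1p(amplitude)
    lo, hi = log_amp.min(), log_amp.max()
    return (log_amp - lo) / (hi - lo + 1e-12)


def low_frequency_fraction(amplitude, radius):
    """Hypothetical statistic: share of amplitude within `radius` of the center."""
    h, w = amplitude.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius**2
    return amplitude[mask].sum() / amplitude.sum()


# Two synthetic 16x16 "feature maps": a slowly varying one and a checkerboard.
size = 16
cols = np.arange(size)
smooth = np.tile(1.0 + np.cos(2 * np.pi * cols / size), (size, 1))[None]
checker = ((-1.0) ** (cols[:, None] + cols[None, :]))[None]

amp_smooth = centered_amplitude(smooth)
amp_checker = centered_amplitude(checker)
norm_smooth = normalize_spectrum(amp_smooth)

frac_smooth = low_frequency_fraction(amp_smooth, radius=3)
frac_checker = low_frequency_fraction(amp_checker, radius=3)
print(frac_smooth > 0.9, frac_checker < 0.1)  # True True
```

In this toy case the smooth map concentrates its amplitude around the center of the shifted spectrum, while the checkerboard places it at the corner (the Nyquist frequency), mirroring the low- versus high-frequency contrast the caption describes for the two branches.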