See More Details: Efficient Image Super-Resolution by Experts Mining

Eduard Zamfir, Zongwei Wu, Nancy Mehta, Yulun Zhang, Radu Timofte

TL;DR

This work tackles the efficiency-accuracy dilemma in single-image super-resolution by introducing SeemoRe, a model that combines multiple experts at macro and micro scales to maximize intra-feature interactions with minimal computation. It features a Rank Modulating Expert (RME) built on a Mixture of Low-Rank Expertise (MoRE) and a Spatial Modulating Expert (SME) complemented by a Spatial Enhancement Expertise (SEE) to emulate local attention efficiently. A top-1 dynamic routing mechanism selects the most relevant expert per layer, enabling significant reductions in GMACS and parameters while achieving state-of-the-art results on standard SR benchmarks. The proposed approach offers a practical and scalable solution for efficient SR, with extensive ablations and visual analyses supporting the effectiveness of the MoRE and SEE components and their synergistic interaction.

Abstract

Reconstructing high-resolution (HR) images from low-resolution (LR) inputs poses a significant challenge in image super-resolution (SR). While recent approaches have demonstrated the efficacy of intricate operations customized for various objectives, the straightforward stacking of these disparate operations can result in a substantial computational burden, hampering their practical utility. In response, we introduce SeemoRe, an efficient SR model employing expert mining. Our approach strategically incorporates experts at different levels, adopting a collaborative methodology. At the macro scale, our experts address rank-wise and spatial-wise informative features, providing a holistic understanding. Subsequently, the model delves into the subtleties of rank choice by leveraging a mixture of low-rank experts. By tapping into experts specialized in distinct key factors crucial for accurate SR, our model excels in uncovering intricate intra-feature details. This collaborative approach is reminiscent of the concept of "see more", allowing our model to achieve optimal performance with minimal computational costs in efficient settings. The source code will be made publicly available at https://github.com/eduardzamfir/seemoredetails
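
To make the idea of a mixture of low-rank experts with top-1 routing more concrete, the following is a minimal NumPy sketch and not the authors' implementation. Everything in it is an assumption for illustration: the function names `low_rank_expert` and `top1_route`, the ranks, the feature sizes, the global average pooling used to feed the router, and the scaling of the output by the router probability (a common mixture-of-experts convention). It only shows why a rank-$r$ expert is cheap (about $2Cr$ weights instead of $C^2$ for a dense projection) and how a router can activate a single expert.

```python
import numpy as np


def low_rank_expert(x, down, up):
    """Apply one low-rank expert: project C -> r, then r -> C."""
    return (x @ down) @ up


def top1_route(x, router_w, experts):
    """Send the whole input to the single highest-scoring expert.

    x        : (N, C) features, N spatial positions, C channels
    router_w : (C, E) router weights for E experts
    experts  : list of (down, up) pairs with shapes (C, r) and (r, C)
    """
    # Assumption: one routing decision per input, based on pooled features.
    pooled = x.mean(axis=0)                      # (C,)
    logits = pooled @ router_w                   # (E,)
    probs = np.exp(logits - logits.max())        # numerically stable softmax
    probs /= probs.sum()
    k = int(np.argmax(probs))                    # top-1 expert index
    down, up = experts[k]
    # Assumption: scale by the router probability, as many MoE layers do.
    return probs[k] * low_rank_expert(x, down, up), k


rng = np.random.default_rng(0)
C, N = 32, 256                                   # hypothetical sizes
ranks = [2, 4, 8]                                # hypothetical expert ranks
experts = [
    (0.1 * rng.standard_normal((C, r)), 0.1 * rng.standard_normal((r, C)))
    for r in ranks
]
router_w = rng.standard_normal((C, len(ranks)))
x = rng.standard_normal((N, C))

y, k = top1_route(x, router_w, experts)

# Weight counts: a dense C x C projection versus each rank-r expert.
dense_params = C * C
expert_params = [2 * C * r for r in ranks]
print("selected expert rank:", ranks[k])
print("dense params:", dense_params, "expert params:", expert_params)
```

Because only the selected expert runs, the cost of a forward pass in this sketch depends on the chosen rank rather than on the number of experts, which is the intuition behind pairing low-rank experts with top-1 routing.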

Paper Structure

This paper contains 37 sections, 4 equations, 8 figures, 15 tables, 1 algorithm.

Figures (8)

  • Figure 1: Model complexity trade-off. Visualization of PSNR, GMACS, and parameter counts on the Manga109 dataset for the $\times 2$ task. Our proposed SeemoRe outperforms state-of-the-art CNN-based and lightweight Transformer-based SR models. Marker size indicates parameter count relative to SwinIR-Light liu2021swin.
  • Figure 2: Architecture Overview. SeemoRe refines the feature representations via stacked Residual Groups (RGs). Each RG consists of a Rank Modulating Expert (RME) and a Spatial Modulating Expert (SME). RME leverages the Mixture of Low-Rank Expertise (MoRE) to refine the global texture, while SME employs spatial enhancement experts (SEE) to supplement RME with spatial cues.
  • Figure 3: Illustration of the proposed Mixture of Low-Rank Expertise (MoRE) as a core block of the RME.
  • Figure 4: Visual comparison of SeemoRe with state-of-the-art methods on challenging cases for $\times 4$ SR from the Urban100 benchmark.
  • Figure 5: Low-Rank Analysis. (a) We plot the decisions made by the routing function for SeemoRe-T over the depth of the network. (b) We visualize the low-rank features of SeemoRe-T for $\times 4$ SR given example images from Urban100 and Manga109.
  • ...and 3 more figures