Balancing Efficiency and Quality: MoEISR for Arbitrary-Scale Image Super-Resolution

Young Jae Oh, Jihun Kim, Jihoon Nam, Tae Hyun Kim

TL;DR

This work tackles the inefficiency of arbitrary-scale SR methods that query a single heavy decoder for every output pixel. It introduces MoEISR, a mixture-of-experts framework that couples an encoder-generated implicit representation with a pool of decoders of varying depths and a per-pixel mapper that assigns each pixel to an appropriate expert. The model is trained with differentiable Gumbel-Softmax routing, and its loss combines a reconstruction term with a balancing term that distributes workload across experts, so that computation is spent where it is needed. Empirical results show substantial FLOPs reductions while maintaining or improving PSNR across multiple backbones and datasets, and ablations examine the roles of mapper depth, temperature, and controllable routing. Overall, MoEISR offers a flexible, model-agnostic approach to efficient arbitrary-scale SR with potential applicability to other INR-based tasks.

Abstract

Arbitrary-scale image super-resolution employing implicit neural functions has gained significant attention lately due to its capability to upscale images across diverse scales using only a single model. Nevertheless, these methods impose substantial computational demands because they query a single resource-intensive decoder for every target pixel. In this paper, we introduce a novel and efficient framework, the Mixture-of-Experts Implicit Super-Resolution (MoEISR), which enables super-resolution at arbitrary scales with significantly increased computational efficiency without sacrificing reconstruction quality. MoEISR dynamically allocates the most suitable decoding expert to each pixel using a lightweight mapper module, allowing experts with varying capacities to reconstruct pixels across regions with diverse complexities. Our experiments demonstrate that MoEISR significantly reduces the number of floating point operations (FLOPs) while delivering comparable or superior peak signal-to-noise ratio (PSNR).
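
The per-pixel allocation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy sizes, random weights, the helper names (`make_mlp`, `mlp_forward`, `mlp_macs`, `route_and_decode`) and the choice of four experts with 1 to 4 hidden layers are all assumptions made for illustration. It only shows the idea of a lightweight mapper scoring the experts for each pixel, with the highest-scoring decoder alone being evaluated, and it counts multiply-accumulates as a rough proxy for FLOPs.

```python
import random

def make_mlp(depth, width, in_dim, out_dim, rng):
    """Build a random-weight MLP with `depth` hidden layers as (weights, biases) pairs."""
    dims = [in_dim] + [width] * depth + [out_dim]
    layers = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(fan_in)] for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers

def mlp_forward(layers, x):
    """Apply the MLP to one input vector; ReLU after every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def mlp_macs(layers):
    """Multiply-accumulate count of one forward pass (a rough FLOPs proxy)."""
    return sum(len(weights) * len(weights[0]) for weights, _ in layers)

def route_and_decode(features, coords, mapper, experts):
    """Hard routing: the mapper scores the experts per pixel; only the argmax expert runs."""
    rgb, expert_map, macs = [], [], 0
    for z, x in zip(features, coords):
        scores = mlp_forward(mapper, z)
        j = max(range(len(experts)), key=scores.__getitem__)
        rgb.append(mlp_forward(experts[j], z + x))  # decoder sees feature and coordinate
        expert_map.append(j)
        macs += mlp_macs(mapper) + mlp_macs(experts[j])
    return rgb, expert_map, macs

rng = random.Random(0)
FEAT, WIDTH, N_PIXELS = 8, 16, 32          # hypothetical toy sizes
mapper = make_mlp(1, WIDTH, FEAT, 4, rng)  # lightweight mapper, one score per expert
experts = [make_mlp(d, WIDTH, FEAT + 2, 3, rng) for d in (1, 2, 3, 4)]  # varying depths
features = [[rng.gauss(0, 1) for _ in range(FEAT)] for _ in range(N_PIXELS)]
coords = [[rng.random(), rng.random()] for _ in range(N_PIXELS)]

rgb, expert_map, routed_macs = route_and_decode(features, coords, mapper, experts)
single_macs = N_PIXELS * mlp_macs(experts[-1])  # every pixel through the deepest decoder
print(expert_map)
print("routed MACs:", routed_macs, "single-decoder MACs:", single_macs)
```

With untrained random weights the expert map is arbitrary, so the printed counts say nothing about the savings reported in the paper; the sketch only makes explicit that the routed cost is the mapper cost plus the cost of whichever expert each pixel is sent to.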

Paper Structure

This paper contains 11 sections, 7 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Expert map from the mapper. Yellow, green, blue and red pixels in the expert map denote varying levels of reconstruction complexity and their respective associated experts.
  • Figure 2: Arbitrary-scale SR with MoEISR. The INR ($z$) from the encoder ($E_{\theta}$) goes through the mapper ($M_{\omega}$), creating an expert map that assigns the most suitable expert to each output pixel. Then, $z$ and a target coordinate $x$ are passed to the designated decoder ($f^j_{\phi_j}$), which predicts the pixel's RGB value.
  • Figure 3: Qualitative comparison between MoEISR and the backbone models. RDN is used as the encoder for all methods.
  • Figure 5: MoEISR with different configurations. Expert maps (each pixel shown with its highest-probability expert) for a 1-layer mapper (leftmost) and a 5-layer mapper (second-left) with the corresponding input image (third-left), and for $\tau=1$ (third-right) and $\tau=5$ (second-right) with the corresponding input image (rightmost). Yellow, green, blue, and red pixels denote experts with different numbers of layers.
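
Assuming the $\tau$ in Figure 5 is the temperature of the Gumbel-Softmax routing mentioned in the TL;DR, its effect can be sketched with the standard Gumbel-Softmax relaxation. The mapper scores below are made-up values for four experts and `mean_peak_probability` is a hypothetical helper; the paper's actual training setup is not reproduced here. The sketch shows that, for the same noise, a lower temperature yields samples closer to a one-hot expert choice, while a higher temperature spreads probability across experts.

```python
import math
import random

def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed (soft) one-hot sample over experts at temperature tau."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)      # avoid log(0)
        gumbel = -math.log(-math.log(u))  # standard Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)                     # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def mean_peak_probability(logits, tau, samples=2000, seed=0):
    """Average of the largest probability per sample; near 1.0 means near one-hot."""
    rng = random.Random(seed)             # same seed, so both temperatures see the same noise
    return sum(max(gumbel_softmax(logits, tau, rng)) for _ in range(samples)) / samples

mapper_logits = [1.0, 0.5, 0.0, -0.5]     # hypothetical mapper scores for four experts
for tau in (1.0, 5.0):
    print(tau, round(mean_peak_probability(mapper_logits, tau), 3))
```

Because the relaxed sample is a smooth function of the logits, gradients can flow back to the mapper during training, whereas a hard argmax (as used to draw the expert maps) would block them.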