Balancing Efficiency and Quality: MoEISR for Arbitrary-Scale Image Super-Resolution
Young Jae Oh, Jihun Kim, Jihoon Nam, Tae Hyun Kim
TL;DR
This work tackles the inefficiency of arbitrary-scale super-resolution (SR) methods that query a single heavy decoder for every output pixel. It introduces MoEISR, a mixture-of-experts framework that couples an encoder-generated implicit representation with a pool of decoders of varying depths and a per-pixel mapper that assigns each pixel to an appropriate expert. Routing is trained with a differentiable Gumbel-Softmax, and the loss combines a reconstruction term with a balance term that distributes workload across experts, so computation is concentrated where it is needed. Empirical results show substantial FLOPs reductions while maintaining or improving PSNR across multiple backbones and datasets, and ablations examine the roles of mapper depth, temperature, and controllable routing. Overall, MoEISR offers a flexible, model-agnostic approach to efficient arbitrary-scale SR with potential applicability to other tasks based on implicit neural representations (INRs).
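The routing idea summarized above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names `gumbel_softmax` and `balance_penalty`, the example logits, and in particular the form of the balance term (squared deviation of mean expert usage from uniform) are assumptions made for illustration only.

```python
import math
import random

def gumbel_softmax(logits, tau, rng):
    """Sample relaxed one-hot routing weights from one pixel's expert logits."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)); clamp U away from 0 and 1.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / tau)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def balance_penalty(weights_per_pixel):
    """Hypothetical balance term: squared deviation of mean usage from uniform."""
    n_pixels = len(weights_per_pixel)
    n_experts = len(weights_per_pixel[0])
    usage = [sum(w[k] for w in weights_per_pixel) / n_pixels
             for k in range(n_experts)]
    return sum((u - 1.0 / n_experts) ** 2 for u in usage)

rng = random.Random(0)
# Assumed mapper outputs: 3 pixels, 3 experts (values are made up).
mapper_logits = [[2.0, 0.1, -1.0], [0.0, 0.0, 0.0], [-1.0, 0.5, 2.5]]
soft = [gumbel_softmax(row, tau=1.0, rng=rng) for row in mapper_logits]
# Discrete expert index per pixel, taken as the largest sampled weight.
hard = [max(range(len(w)), key=w.__getitem__) for w in soft]
penalty = balance_penalty(soft)
```

A lower temperature `tau` pushes the sampled weights closer to one-hot, while the penalty is zero only when the experts are used equally on average.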
Abstract
Arbitrary-scale image super-resolution based on implicit neural functions has recently gained significant attention because it can upscale images across diverse scales using only a single model. However, these methods impose substantial computational demands, as every target pixel is queried through a single resource-intensive decoder. In this paper, we introduce a novel and efficient framework, the Mixture-of-Experts Implicit Super-Resolution (MoEISR), which enables super-resolution at arbitrary scales with significantly improved computational efficiency without sacrificing reconstruction quality. MoEISR dynamically allocates the most suitable decoding expert to each pixel using a lightweight mapper module, allowing experts with varying capacities to reconstruct pixels across regions with diverse complexities. Our experiments demonstrate that MoEISR significantly reduces the number of floating-point operations (FLOPs) while delivering comparable or superior peak signal-to-noise ratio (PSNR).
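To see why per-pixel expert allocation can lower the FLOPs count, consider the rough accounting sketch below. Every number in it (expert depths, hidden width, mapper size, pixel assignments) is a hypothetical placeholder rather than a configuration or result from the paper, and the cost model simply counts one multiply and one add per weight of a square fully connected layer.

```python
def mlp_flops(depth, width):
    """Rough per-pixel cost: one multiply and one add per weight, `depth` square layers."""
    return 2 * depth * width * width

def routed_flops(assignments, expert_depths, width, mapper_flops):
    """Total cost when each pixel pays for the mapper plus its assigned expert only."""
    return sum(mapper_flops + mlp_flops(expert_depths[k], width)
               for k in assignments)

expert_depths = [1, 3, 5]         # hypothetical shallow, medium, and deep decoders
width = 256                       # hypothetical hidden width
assignments = [0, 0, 0, 1, 1, 2]  # hypothetical expert index chosen for each pixel
mapper_cost = mlp_flops(1, 64)    # hypothetical lightweight mapper

# Baseline: every pixel goes through the deepest decoder, with no mapper.
baseline = len(assignments) * mlp_flops(max(expert_depths), width)
routed = routed_flops(assignments, expert_depths, width, mapper_cost)
saving = 1.0 - routed / baseline  # fraction of baseline cost avoided
```

The saving depends entirely on how many pixels the mapper sends to shallow experts; if every pixel were routed to the deepest expert, the mapper overhead would make the routed cost slightly higher than the baseline.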
