Mixture of Ranks with Degradation-Aware Routing for One-Step Real-World Image Super-Resolution

Xiao He, Zhijun Tu, Kun Cheng, Mingrui Zhu, Jie Hu, Nannan Wang, Xinbo Gao

TL;DR

MoR-DASR tackles Real-ISR under real-world degradations by embedding a degradation-aware Mixture-of-Ranks within a LoRA-fine-tuned diffusion framework. A CLIP-based degradation estimator dynamically guides per-input routing, while zero-expert slots and a degradation-aware load-balancing loss adapt computational budgets. The approach achieves state-of-the-art performance among one-step Real-ISR methods and delivers substantial inference speedups versus multi-step baselines. These results highlight an effective path toward efficient, high-fidelity restoration under heterogeneous real-world degradation conditions.

Abstract

The demonstrated success of sparsely-gated Mixture-of-Experts (MoE) architectures, exemplified by models such as DeepSeek and Grok, has motivated researchers to investigate their adaptation to diverse domains. In real-world image super-resolution (Real-ISR), existing approaches mainly rely on fine-tuning pre-trained diffusion models through Low-Rank Adaptation (LoRA) modules to reconstruct high-resolution (HR) images. However, these dense Real-ISR models are limited in their ability to adaptively capture the heterogeneous characteristics of complex real-world degraded samples or to enable knowledge sharing between inputs under equivalent computational budgets. To address this, we investigate the integration of sparse MoE into Real-ISR and propose a Mixture-of-Ranks (MoR) architecture for single-step image super-resolution. We introduce a fine-grained expert partitioning strategy that treats each rank in LoRA as an independent expert. This design enables flexible knowledge recombination while isolating fixed-position ranks as shared experts to preserve common-sense features and minimize routing redundancy. Furthermore, we develop a degradation estimation module that leverages CLIP embeddings and predefined positive-negative text pairs to compute relative degradation scores, which dynamically guide expert activation. To better accommodate varying sample complexities, we incorporate zero-expert slots and propose a degradation-aware load-balancing loss, which dynamically adjusts the number of active experts based on degradation severity so that computation is allocated according to sample difficulty. Comprehensive experiments validate our framework's effectiveness and state-of-the-art performance.
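
The relative degradation score mentioned in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the text above, that the score for one positive-negative text pair is a two-way softmax over the cosine similarities between the image embedding and the two text embeddings, and that the weight on the negative prompt is read as degradation severity. The function names, the temperature value, and the toy 2-D vectors are hypothetical; real embeddings would come from a CLIP image and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relative_degradation_score(image_emb, pos_emb, neg_emb, temperature=0.01):
    """Softmax weight on the negative prompt of one text pair.

    Assumption: a higher value means the image sits closer to the
    negative (degraded) description than to the positive one.
    """
    s_pos = cosine(image_emb, pos_emb) / temperature
    s_neg = cosine(image_emb, neg_emb) / temperature
    m = max(s_pos, s_neg)  # subtract the max for numerical stability
    e_pos = math.exp(s_pos - m)
    e_neg = math.exp(s_neg - m)
    return e_neg / (e_pos + e_neg)


# Toy stand-ins for CLIP embeddings (hypothetical values).
pos_text = [1.0, 0.0]  # e.g. a prompt describing a clean image
neg_text = [0.0, 1.0]  # e.g. a prompt describing a degraded image
print(relative_degradation_score([0.2, 0.9], pos_text, neg_text))
```

Because the score is relative within a pair, an image equally similar to both prompts gets 0.5 regardless of how similar it is to either in absolute terms.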

Paper Structure

This paper contains 22 sections, 17 equations, 9 figures, 13 tables.

Figures (9)

  • Figure 1: Performance Comparison. Compared to other Real-ISR methods, MoR-DASR achieves superior performance with just a single diffusion step.
  • Figure 2: The training framework of MoR-DASR. The LR image is passed through a trainable encoder $Enc_{\theta}$, a diffusion network with a MoR module $\epsilon_{\theta}$, and a frozen decoder $Dec$ to obtain the desired HR image. The training procedure alternates between two phases: 1. Optimizing the variational score network $\epsilon_{\psi}$ through the diffusion loss $\mathcal{L}_{diff}$ to fit the distribution of the generated samples. 2. Fine-tuning the diffusion model $\epsilon_{\theta}$ and encoder $Enc_{\theta}$ to generate high-quality samples through the reconstruction loss $\mathcal{L}_{rec}$, the variational score distillation loss $\mathcal{L}_{VSD}$, and the GAN loss $\mathcal{L}_{GAN}$.
  • Figure 3: The framework of the degradation estimation module.
  • Figure 4: Comparison of LoRA, LoRA MoE and MoR. In MoR, each rank is treated as an expert. A subset of these ranks is designated as shared experts to process all samples, while the remaining ranks function as routed experts that are selectively activated to process specific samples.
  • Figure 5: Visual comparisons of different Real-ISR methods. Please zoom in for a better view.
  • ...and 4 more figures
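
The rank-as-expert idea described in the Figure 4 caption can be sketched for a single LoRA update as follows. This is an illustration under stated assumptions, not the paper's implementation: the router logits are taken as a given input (how the degradation score shapes them is not specified here), top-k selection over a softmax is assumed, and a selected zero-expert slot is modeled as contributing nothing. The function name `mor_forward` and all parameter names are hypothetical.

```python
import math


def mor_forward(x, A, B, router_logits, n_shared, n_zero, top_k):
    """Low-rank update where each LoRA rank acts as one expert.

    x: input vector of length d_in.
    A: r rows of length d_in (one row per rank).
    B: d_out rows of length r.
    router_logits: one logit per routed rank, then n_zero logits for
        zero-expert slots (assumed layout).
    The first n_shared ranks are shared experts and are always active.
    Returns the update vector and the number of active routed ranks.
    """
    r = len(A)
    n_routed = r - n_shared
    assert len(router_logits) == n_routed + n_zero

    # Softmax over routed ranks and zero-expert slots.
    m = max(router_logits)
    exps = [math.exp(l - m) for l in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the top-k slots; a chosen zero-expert slot activates no rank.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    gates = [1.0] * n_shared + [0.0] * n_routed
    for slot in chosen:
        if slot < n_routed:
            gates[n_shared + slot] = probs[slot]

    # Per-rank projection, gated, then mapped back through B.
    h = [sum(a * xi for a, xi in zip(A[i], x)) for i in range(r)]
    out = [sum(B[o][i] * gates[i] * h[i] for i in range(r)) for o in range(len(B))]
    active = sum(1 for g in gates[n_shared:] if g > 0.0)
    return out, active


# Toy example: 3 ranks (1 shared, 2 routed) plus 1 zero-expert slot.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0, 1.0]]
update, n_active = mor_forward([1.0, 2.0], A, B, [0.0, 0.0, 10.0],
                               n_shared=1, n_zero=1, top_k=1)
print(update, n_active)
```

In the toy call the router prefers the zero-expert slot, so only the shared rank contributes. This is the mechanism by which zero-expert slots let the number of active ranks vary from one input to another.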