Complexity Experts are Task-Discriminative Learners for Any Image Restoration

Eduard Zamfir, Zongwei Wu, Nancy Mehta, Yuedong Tan, Danda Pani Paudel, Yulun Zhang, Radu Timofte

TL;DR

MoCE-IR addresses inefficiency and inconsistent use of experts in all-in-one image restoration by introducing complexity experts and a complexity-aware routing that biases toward simpler, lower-cost experts. The method uses nested experts with increasing capacity and receptive fields, a shared transformer-based path, FFT-based attention, and an image-level routing with a complexity-bias auxiliary loss to achieve task-discriminative allocations. It demonstrates state-of-the-art performance across multiple degradations while reducing computational load, enabling efficient inference by bypassing irrelevant experts. This work advances all-in-one restoration by unifying task-specific processing and cross-task sharing within a scalable, parameter-efficient MoE framework.

Abstract

Recent advancements in all-in-one image restoration models have revolutionized the ability to address diverse degradations through a unified framework. However, parameters tied to specific tasks often remain inactive for other tasks, making mixture-of-experts (MoE) architectures a natural extension. Despite this, MoEs often show inconsistent behavior, with some experts unexpectedly generalizing across tasks while others struggle within their intended scope. This hinders leveraging MoEs' computational benefits by bypassing irrelevant experts during inference. We attribute this undesired behavior to the uniform and rigid architecture of traditional MoEs. To address this, we introduce "complexity experts": flexible expert blocks with varying computational complexity and receptive fields. A key challenge is assigning tasks to each expert, as degradation complexity is unknown in advance. Thus, we execute tasks with a simple bias toward lower complexity. To our surprise, this preference effectively drives task-specific allocation, assigning tasks to experts with the appropriate complexity. Extensive experiments validate our approach, demonstrating the ability to bypass irrelevant experts during inference while maintaining superior performance. The proposed MoCE-IR model outperforms state-of-the-art methods, affirming its efficiency and practical applicability. The source code and models are publicly available at https://eduardzamfir.github.io/moceir/
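
The "simple bias toward lower complexity" can be illustrated with a small sketch. The abstract does not state the loss, so everything below is an assumption: the penalty is a hypothetical expected-cost term (router probability times normalized expert cost), and the names `route_image`, `complexity_bias_penalty`, `expert_costs`, and `bias_weight` are invented for illustration only. It shows image-level top-1 routing plus an auxiliary term that grows when the router puts probability mass on more expensive experts.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def complexity_bias_penalty(router_probs, expert_costs):
    """Hypothetical auxiliary term: expected expert cost, normalized to (0, 1].

    Lower values mean the router favors cheaper experts.
    """
    max_cost = max(expert_costs)
    return sum(p * c / max_cost for p, c in zip(router_probs, expert_costs))


def route_image(logits, expert_costs, bias_weight=0.1):
    """Pick one expert for the whole image and return the weighted penalty.

    bias_weight is an assumed hyperparameter, not a value from the paper.
    """
    probs = softmax(logits)
    chosen = max(range(len(probs)), key=probs.__getitem__)
    penalty = bias_weight * complexity_bias_penalty(probs, expert_costs)
    return chosen, probs, penalty


# Toy setup: four experts with increasing (made-up) relative cost.
costs = [1.0, 2.0, 4.0, 8.0]
chosen, probs, penalty = route_image([0.0, 0.0, 0.0, 2.0], costs)
print(chosen, round(penalty, 4))
```

In a training loop such a penalty would be added to the restoration loss, so the router only pays for a larger expert when the cheaper ones cannot reduce the reconstruction error enough. Whether MoCE-IR uses this exact form is not stated in the text above.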

Paper Structure

This paper contains 10 sections, 4 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Motivation. (a) Dense all-in-one restoration methods [li2022airnet, yang2024ldr] often inefficiently allocate parameters when handling multiple degradation types. (b) While recent Mixture-of-Experts (MoE) approaches [zamfir2024efficient, yu2024multi] address this through sparse computation, their rigid routing mechanisms uniformly distribute inputs across experts without considering the natural relationships between degradations. (c) To overcome these limitations, we introduce Complexity Experts: adaptive processing blocks with size-varying computational units. Our framework dynamically allocates model capacity using a spring-inspired force mechanism that continuously guides routing decisions toward simpler experts when possible, with the force proportional to the complexity of the input degradation. While initially designed for computational efficiency, this approach naturally emerges as a task-discriminative learning framework, assigning degradations to the most suitable experts. This makes it particularly effective for all-in-one restoration methods, where both task-specific processing and cross-degradation knowledge sharing are crucial.
  • Figure 2: Proposed MoCE-IR framework. Despite recent advances in MoE-based image restoration [yang2024ldr, zamfir2024details, zamfir2024efficient], inconsistent expert behavior, where some experts over-generalize while others underperform, limits their computational efficiency. We address this through complexity experts: flexible blocks with varying computational capacity and receptive fields. Our MoCE-IR employs an asymmetric encoder-decoder architecture where each decoder block contains a mixture-of-complexity-experts layer for adaptive capacity routing.
  • Figure 3: Visual results. We compare MoCE-IR-S to AirNet [li2022airnet] and PromptIR [potlapalli2023promptir] in the all-in-one setting with three degradations. MoCE-IR-S effectively removes haze and rain streaks while preserving image sharpness, achieving high-quality restoration. An error heatmap is provided, with color transitioning from black to white to indicate increasing pixel-wise error.
  • Figure 4: Complexity-efficiency tradeoff. Visualization of PSNR and parameter counts of the proposed method compared to prior work. The proposed MoCE-IR surpasses prior methods, achieving SoTA results in all-in-one image restoration with enhanced efficiency.
  • Figure 5: Routing visualization for the AIO-3 setting. (a) While load balancing [riquelme2021scaling] ensures uniform expert utilization, it neglects shared task dependencies and task-specific characteristics, limiting restoration quality. (b)-(d) Complexity-aware routing fosters task discrimination by directing complex degradations to experts with broader contextual understanding and vice versa. This allows some experts to generalize across tasks while others specialize, enhancing adaptability. We visualize the average decisions made by each router for dehazing, deraining and denoising. The y-axis indicates increasing expert complexity.
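
The per-task averages described for Figure 5 could be computed along the following lines. This is a minimal sketch for a single router, assuming routing decisions have been logged as `(task, expert_index)` pairs; the function names `average_routing` and `bypassable_experts`, the `threshold` parameter, and the toy log are all hypothetical and not taken from the paper or its code.

```python
from collections import defaultdict


def average_routing(decisions, num_experts):
    """Return, per task, the fraction of images sent to each expert.

    decisions: iterable of (task_name, expert_index) pairs for one router.
    """
    counts = defaultdict(lambda: [0] * num_experts)
    for task, expert in decisions:
        counts[task][expert] += 1
    return {
        task: [c / sum(row) for c in row]
        for task, row in counts.items()
    }


def bypassable_experts(avg, task, threshold=0.0):
    """Experts whose usage for a task is at or below the threshold.

    These are candidates for skipping at inference time for that task.
    """
    return [i for i, frac in enumerate(avg[task]) if frac <= threshold]


# Made-up log: expert indices grow with complexity (0 = simplest).
log = [
    ("dehaze", 3), ("dehaze", 3), ("dehaze", 2),
    ("denoise", 0), ("denoise", 0),
]
avg = average_routing(log, num_experts=4)
print(avg["denoise"], bypassable_experts(avg, "denoise"))
```

Plotting each task's row with the expert index on the y-axis would give a chart of the kind the caption describes. A task whose row concentrates on a few experts indicates which of the others could be bypassed for it.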