M2IR: Proactive All-in-One Image Restoration via Mamba-style Modulation and Mixture-of-Experts

Shiwei Wang, Yongzhen Wang, Bingwen Hu, Liyan Zhang, Xiao-Ping Zhang, Mingqiang Wei

Abstract

While Transformer-based architectures have dominated recent advances in all-in-one image restoration, they remain fundamentally reactive, propagating degradations rather than proactively suppressing them. In the absence of explicit suppression mechanisms, degraded signals interfere with feature learning and force the decoder to balance artifact removal against detail preservation, which increases model complexity and limits adaptability. To address these challenges, we propose M2IR, a novel restoration framework that proactively regulates degradation propagation during the encoding stage and efficiently eliminates residual degradations during decoding. Specifically, the Mamba-Style Transformer (MST) block performs pixel-wise selective state modulation to mitigate degradations while preserving structural integrity. In parallel, the Adaptive Degradation Expert Collaboration (ADEC) module uses degradation-specific experts, guided by a DA-CLIP-driven router and complemented by a shared expert, to eliminate residual degradations through targeted and cooperative restoration. By integrating the MST block and the ADEC module, M2IR moves from passive reaction to active degradation control, harnessing learned representations to achieve superior generalization, enhanced adaptability, and refined recovery of fine-grained details across diverse all-in-one image restoration benchmarks. The source code is available at https://github.com/Im34v/M2IR.

Paper Structure (17 sections, 11 equations, 10 figures, 13 tables)

Figures (10)

  • Figure 1: Performance comparison of recent all-in-one restoration methods under different settings. Here, 3D denotes the three-degradation setting (haze, rain, and noise), while 5D extends 3D by adding blur and low-light. The CDD11 dataset contains four fundamental degradation types (low-light, haze, rain, and snow) as well as seven composite degradation types generated by combining these fundamental types.
  • Figure 2: Overview of the proposed M2IR. (a) Mamba-Style Transformer (MST): Combining the strengths of Mamba and Transformer, it achieves proactive suppression or regulation of degradation information. (b) Adaptive Degradation Expert Collaboration (ADEC): It utilizes DA-CLIP's image encoder and text encoder to obtain degradation-aware contextual priors, further deriving routing weights, and then aggregates the results from the selected $K$ most relevant experts and a shared expert to eliminate residual degradations effectively.
  • Figure 3: Macro architecture illustrations of (a) Transformer [zamir2022restormer] with MDTA and DGFN, (b) Mamba [mamba] with SSM, and (c) the proposed Mamba-Style Transformer (MST) featuring Mamba-Style Attention (MSA).
  • Figure 4: The architecture of the Degradation-Aware Contextual Priors (DACP). DACP leverages pre-trained DA-CLIP to extract degradation-aware priors, which guide the generation of more accurate routing weights.
  • Figure 5: Architecture of the Degradation-Guided Expert Aggregation (DGEA). For each pixel in the feature map, DGEA selects the $K$ most relevant experts based on routing weights and performs degradation-specific processing in collaboration with the shared expert.
  • ...and 5 more figures
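
The captions for Figures 2(b) and 5 describe a per-pixel routing step: routing weights select the $K$ most relevant experts, whose outputs are aggregated together with a shared expert. The following is a minimal sketch of that idea for a single pixel, not the authors' implementation. The function names, the softmax over router scores, the renormalization of the selected weights, and the additive combination with the shared expert are all assumptions made for illustration; the captions do not specify these details, and the toy experts stand in for learned sub-networks.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Expert = Callable[[Vector], Vector]  # hypothetical: maps a pixel feature to a feature


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over raw router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_indices(weights: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest weights, highest first (ties keep original order)."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:k]


def aggregate_pixel(
    feature: Vector,
    router_scores: Sequence[float],
    experts: Sequence[Expert],
    shared_expert: Expert,
    k: int,
) -> Vector:
    """Combine the top-k routed experts with an always-active shared expert.

    Assumption: selected weights are renormalized to sum to 1 and the routed
    output is added to the shared expert's output.
    """
    if not 1 <= k <= len(experts):
        raise ValueError("k must be between 1 and the number of experts")
    weights = softmax(router_scores)
    chosen = top_k_indices(weights, k)
    norm = sum(weights[i] for i in chosen)

    output = list(shared_expert(feature))  # shared expert always contributes
    for i in chosen:
        gate = weights[i] / norm
        expert_out = experts[i](feature)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output


# Toy experts (hypothetical): each simply scales the feature vector.
demo_experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [3.0 * v for v in x],
    lambda x: [0.0 for _ in x],
]
demo_shared = lambda x: list(x)

result = aggregate_pixel([1.0, 2.0], [0.0, 0.0, -100.0], demo_experts, demo_shared, k=2)
print(result)  # [3.5, 7.0]
```

In this toy run the first two experts receive equal gates of 0.5 and the third is never evaluated, which is the sparsity a top-$K$ router is meant to provide. In a real model the router scores would come from degradation-aware priors such as those described for DACP, and the routing would be applied at every spatial location of the feature map.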