Multi-Scale Representation Learning for Image Restoration with State-Space Model

Yuhong He; Long Peng; Qiaosi Yi; Chen Wu; Lu Wang

TL;DR

The paper tackles real-world image restoration under multi-scale degradations by introducing MS-Mamba, an efficient multi-scale state-space modeling framework embedded in a UNet backbone. Its Hierarchical Mamba Block combines global and regional state-space modules (GSSM and RSSM) to capture global, regional, and local features, while an Adaptive Gradient Block (AGB) and a Residual Fourier Block (RFB) strengthen detail extraction. The network is trained with a composite loss $\mathcal{L}_{total} = \lambda_1 \mathcal{L}_1 + \lambda_2 \mathcal{L}_{edge} + \lambda_3 \mathcal{L}_{fft}$. The approach achieves state-of-the-art results on nine public benchmarks across four restoration tasks (deraining, dehazing, denoising, low-light enhancement) at lower computational cost than Transformer-heavy methods. Combining global and regional multi-scale SSMs with frequency- and gradient-based detail modeling yields practical improvements for real-world image restoration, supported by quantitative gains and a user study indicating strong perceptual quality.
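To make the three-term objective concrete, here is a minimal NumPy sketch of such a composite loss for single-channel images. This summary does not give the definitions of $\mathcal{L}_{edge}$ and $\mathcal{L}_{fft}$ or the values of $\lambda_1, \lambda_2, \lambda_3$, so everything below is an assumption: the edge term is taken to be a Charbonnier penalty on Laplacian responses, the frequency term an L1 distance between 2-D FFTs, and the weights and function names are hypothetical placeholders.

```python
import numpy as np

def l1_loss(pred, target):
    # Mean absolute pixel difference.
    return float(np.mean(np.abs(pred - target)))

def laplacian(img):
    # 4-neighbour Laplacian of an (H, W) image with edge-replicated padding.
    p = np.pad(img, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * img

def edge_loss(pred, target, eps=1e-3):
    # ASSUMPTION: Charbonnier penalty on the difference of Laplacian maps.
    diff = laplacian(pred) - laplacian(target)
    return float(np.mean(np.sqrt(diff ** 2 + eps ** 2)))

def fft_loss(pred, target):
    # ASSUMPTION: mean magnitude of the difference between 2-D spectra.
    diff = np.fft.fft2(pred) - np.fft.fft2(target)
    return float(np.mean(np.abs(diff)))

def total_loss(pred, target, lam1=1.0, lam2=0.05, lam3=0.01):
    # Weighted sum of the three terms; the default weights are placeholders,
    # not values taken from the paper.
    return (lam1 * l1_loss(pred, target)
            + lam2 * edge_loss(pred, target)
            + lam3 * fft_loss(pred, target))
```

With these definitions, the pixel term penalises overall intensity errors, the edge term is insensitive to a constant brightness offset and reacts only to structural differences, and the frequency term weights errors by their spectral content.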

Abstract

Image restoration endeavors to reconstruct a high-quality, detail-rich image from a degraded counterpart, which is a pivotal process in photography and various computer vision systems. In real-world scenarios, different types of degradation can cause the loss of image details at various scales and degrade image contrast. Existing methods predominantly rely on CNNs and Transformers to capture multi-scale representations. However, these methods are often limited by the high computational complexity of Transformers and the constrained receptive field of CNNs, which hinder them from achieving superior performance and efficiency in image restoration. To address these challenges, we propose a novel Multi-Scale State-Space Model-based network (MS-Mamba) for efficient image restoration that enhances the capacity for multi-scale representation learning through our proposed global and regional SSM modules. Additionally, an Adaptive Gradient Block (AGB) and a Residual Fourier Block (RFB) are proposed to improve the network's detail extraction capabilities by capturing gradients in various directions and facilitating the learning of details in the frequency domain. Extensive experiments on nine public benchmarks across four classic image restoration tasks (image deraining, dehazing, denoising, and low-light enhancement) demonstrate that our proposed method achieves new state-of-the-art performance while maintaining low computational complexity. The source code will be publicly available.
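The distinction between global and regional SSM modules can be illustrated with a toy example. The sketch below is not the paper's GSSM or RSSM; it assumes a scalar linear recurrence $h_t = a h_{t-1} + b x_t$, $y_t = c h_t$ as a stand-in for a state-space layer, and the names `ssm_scan`, `global_scan`, `regional_scan`, and the `region` parameter are hypothetical. A global scan runs the recurrence over the whole flattened image, while a regional scan restarts it inside each non-overlapping window, so the state only carries context from within that window.

```python
import numpy as np

def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    # Scalar linear recurrence over a 1-D sequence:
    # h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t, starting from h = 0.
    h = 0.0
    y = np.empty(len(x), dtype=float)
    for t, xt in enumerate(x):
        h = a * h + b * xt
        y[t] = c * h
    return y

def global_scan(img, **kw):
    # One scan over all H*W pixels in row-major order (image-wide context).
    h, w = img.shape
    return ssm_scan(img.reshape(-1), **kw).reshape(h, w)

def regional_scan(img, region=4, **kw):
    # Independent scans inside each region x region window; the hidden
    # state is reset at every window, so context stays regional.
    h, w = img.shape
    assert h % region == 0 and w % region == 0, "image must tile evenly"
    out = np.empty((h, w), dtype=float)
    for i in range(0, h, region):
        for j in range(0, w, region):
            patch = img[i:i + region, j:j + region]
            out[i:i + region, j:j + region] = ssm_scan(
                patch.reshape(-1), **kw
            ).reshape(region, region)
    return out
```

On a constant image, the global scan keeps accumulating state across the whole sequence, whereas the regional scan returns to its initial response at the first pixel of every window, which is the behaviour the two scales are meant to contrast.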

Paper Structure (17 sections, 5 equations, 6 figures, 6 tables)

Introduction
Related Work
Image Restoration
State Space Models
Method
State Space Models
Hierarchical Mamba Block
Adaptive Gradient Block
Residual Fourier Block
Loss Function
Experiments
Experimental Settings
Quantitative and Qualitative Results
Comparison of model complexity
Ablation Study
...and 2 more sections

Figures (6)

Figure 1: Model complexity and performance comparison between our MS-Mamba and existing state-of-the-art and classic image restoration methods on the SPA-Data dataset. Our method achieves superior performance while maintaining lower computational costs.
Figure 2: Architecture of our proposed MS-Mamba, which adopts a multi-scale UNet architecture and comprises the novel Hierarchical Mamba Block, Adaptive Gradient Block, and Residual Fourier Block.
Figure 3: Visual comparison on synthetic rainy images from the Rain200H dataset. Zoom in for a better view. Our proposed method effectively removes rain streaks and delivers the most visually pleasing results.
Figure 4: Visual comparison on real low-light images from the LOLv2 dataset. Zoom in for a better view. Our proposed method delivers the most visually pleasing results.
Figure 5: Visual comparison of image dehazing on the SOTS-indoor dataset. Best viewed on screen.
...and 1 more figure
