CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration

Rui Deng, Tianpei Gu

TL;DR

The paper addresses image restoration by overcoming CNN and Transformer limitations in modeling long-range dependencies and computational cost. It introduces CU-Mamba, a U-Net variant that embeds a Spatial SSM for global spatial context and a Channel SSM for channel mixing, both enabled by a selective gating mechanism that preserves linear complexity, yielding a total cost of $\mathcal{O}(BE(L+C))$. Through extensive denoising and deblurring experiments, CU-Mamba demonstrates state-of-the-art performance with lower computational overhead compared to Transformer-based methods, supported by ablations that validate the complementary roles of Spatial and Channel SSM blocks. The work demonstrates that jointly modeling spatial and channel contexts in a dual-directional SSM framework provides a practical and effective route for high-quality image restoration at scale.

Abstract

Reconstructing degraded images is a critical task in image processing. Although CNN and Transformer-based models are prevalent in this field, they exhibit inherent limitations, such as inadequate long-range dependency modeling and high computational costs. To overcome these issues, we introduce the Channel-Aware U-Shaped Mamba (CU-Mamba) model, which incorporates a dual State Space Model (SSM) framework into the U-Net architecture. CU-Mamba employs a Spatial SSM module for global context encoding and a Channel SSM component to preserve channel correlation features, both in linear computational complexity relative to the feature map size. Extensive experimental results validate CU-Mamba's superiority over existing state-of-the-art methods, underscoring the importance of integrating both spatial and channel contexts in image restoration.
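
The two scan directions described above can be illustrated with a small sketch. It is not the authors' implementation: the helper names (`spatial_tokens`, `channel_tokens`, `linear_scan`) are hypothetical, and a fixed-decay recurrence stands in for the learned SSM. It only shows how one feature map can be read as a sequence of L = H × W pixel tokens or as a sequence of C channel tokens, with each scan taking a single pass over its sequence.

```python
import numpy as np

def spatial_tokens(x):
    """(B, C, H, W) -> (B, L, C): one token per pixel, L = H * W."""
    b, c, h, w = x.shape
    return x.reshape(b, c, h * w).transpose(0, 2, 1)

def channel_tokens(x):
    """(B, C, H, W) -> (B, C, L): one token per channel map."""
    b, c, h, w = x.shape
    return x.reshape(b, c, h * w)

def linear_scan(tokens, decay=0.9):
    """Toy stand-in for an SSM: h_t = decay * h_{t-1} + x_t.

    A single pass over axis 1, so the work grows linearly with
    the sequence length (assumed placeholder, not a learned SSM).
    """
    out = np.empty_like(tokens)
    h = np.zeros_like(tokens[:, 0])
    for t in range(tokens.shape[1]):
        h = decay * h + tokens[:, t]
        out[:, t] = h
    return out

x = np.random.default_rng(0).standard_normal((2, 8, 4, 4))  # B=2, C=8, H=W=4
spatial_out = linear_scan(spatial_tokens(x))   # 16 steps, one per pixel
channel_out = linear_scan(channel_tokens(x))   # 8 steps, one per channel
```

The spatial scan takes L steps and the channel scan takes C steps, which is the intuition behind a cost that is additive in L and C rather than quadratic in either.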

Paper Structure (14 sections, 4 equations, 4 figures, 4 tables)

This paper contains 14 sections, 4 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The overall pipeline of CU-Mamba. Each CU-Mamba Block consists of a Spatial SSM block (as explained in (1)) followed by a Channel SSM block (as detailed in (2)). The structure of the SelectiveSSM block is explained in Fig. 2.
  • Figure 2: The structure of the SelectiveSSM block. On top of the traditional SSM block, the selective SSM adds a SiLU activation similar to the Gated MLP [liu2021pay]. This gated design allows the model to fuse and select information across tokens, while the Linear and Conv layers allow the model to learn input-dependent parameters.
  • Figure 3: Visualization of image denoising results on the SIDD [abdelhamed2018high] dataset.
  • Figure 4: Visualization of image motion deblurring results on the GoPro [nah2017deep] dataset.
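
The Figure 2 caption describes a gated block whose Linear and Conv layers produce input-dependent SSM parameters. A minimal NumPy sketch of such a block follows, written in the general style of Mamba-like selective SSMs. It is an assumption-laden illustration, not the paper's code: the parameter names (`W_in`, `W_gate`, `W_delta`, `W_B`, `W_C`, `A_log`, `W_out`), the softplus step size, the simplified discretization, and the random initialization are all hypothetical choices.

```python
import numpy as np

def silu(z):
    return z / (1.0 + np.exp(-z))

def init_params(rng, D, N, K, scale=0.1):
    """Random weights for the sketch (hypothetical layout, not from the paper)."""
    w = lambda *shape: scale * rng.standard_normal(shape)
    return {
        "W_in": w(D, D), "W_gate": w(D, D), "W_out": w(D, D),
        "conv": w(K, D),            # depthwise causal conv kernel
        "W_delta": w(D, D), "W_B": w(D, N), "W_C": w(D, N),
        "A_log": np.zeros((D, N)),  # A = -exp(A_log) stays negative
    }

def selective_ssm_block(x, p):
    """x: (L, D) token sequence -> (L, D) output."""
    L, D = x.shape
    u = x @ p["W_in"]               # main branch (Linear)
    gate = silu(x @ p["W_gate"])    # gating branch (Linear + SiLU)

    # Depthwise causal convolution followed by SiLU.
    K = p["conv"].shape[0]
    padded = np.vstack([np.zeros((K - 1, D)), u])
    u = silu(sum(padded[i:i + L] * p["conv"][i] for i in range(K)))

    # Input-dependent parameters: step size, input and output projections.
    delta = np.log1p(np.exp(u @ p["W_delta"]))  # softplus, (L, D)
    B = u @ p["W_B"]                            # (L, N)
    C = u @ p["W_C"]                            # (L, N)
    A = -np.exp(p["A_log"])                     # (D, N)

    # Sequential scan: h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * u_t.
    h = np.zeros((D, A.shape[1]))
    y = np.empty((L, D))
    for t in range(L):
        dA = np.exp(delta[t][:, None] * A)
        dB = delta[t][:, None] * B[t][None, :]
        h = dA * h + dB * u[t][:, None]
        y[t] = h @ C[t]

    return (y * gate) @ p["W_out"]  # gate the scan output, then project

rng = np.random.default_rng(0)
L, D, N, K = 16, 8, 4, 3
params = init_params(rng, D, N, K)
x = rng.standard_normal((L, D))
y = selective_ssm_block(x, params)
```

In this sketch the loop visits each token once, so the cost is linear in the sequence length. Under the same assumptions, a channel-direction variant would apply the same routine to the transposed token matrix, with a separate parameter set sized for that layout.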