FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining

Zou Zhen, Yu Hu, Zhao Feng

TL;DR

This work tackles rain-induced degradation in images, where rain streaks disrupt global frequency patterns and local textures. It introduces FreqMamba, a three-branch Frequency-SSM framework that unifies Spatial Mamba, Frequency Band Mamba, and Fourier-based global modeling to capture both local and global degradation patterns. A data-dependent degradation prior attention mechanism and a composite loss integrating spatial and spectral terms guide training, resulting in state-of-the-art deraining performance on standard benchmarks and extendability to low-light enhancement and dehazing tasks. The approach offers a practical, efficient solution for robust image restoration in adverse weather, with potential applicability to a broader set of vision tasks requiring global and local modeling.

Abstract

Images corrupted by rain streaks often lose frequency information that is vital for perception. Image deraining aims to recover it, which requires both global and local degradation modeling. Recent studies have demonstrated the effectiveness and efficiency of Mamba in perceiving global and local information by exploiting local correlation among patches. However, few attempts have been made to extend it with frequency analysis for image deraining, which limits its ability to perceive global degradation that is relevant to frequency modeling (e.g., the Fourier transform). In this paper, we propose FreqMamba, an effective and efficient paradigm that leverages the complementarity between Mamba and frequency analysis for image deraining. The core of our method lies in extending Mamba with frequency analysis from two perspectives: extending it with frequency bands to exploit frequency correlation, and connecting it with the Fourier transform for global degradation modeling. Specifically, FreqMamba introduces three complementary interacting structures: spatial Mamba, frequency band Mamba, and Fourier global modeling. Frequency band Mamba decomposes the image into sub-bands of different frequencies to allow 2D scanning along the frequency dimension. Furthermore, leveraging Mamba's unique data-dependent properties, we use rainy images at different scales to provide degradation priors to the network, thereby facilitating efficient training. Extensive experiments show that our method outperforms state-of-the-art methods both visually and quantitatively.
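To make the sub-band decomposition mentioned in the abstract concrete, here is a minimal NumPy sketch of a k-level wavelet packet transform that splits an image into 4^k bands. It assumes a Haar filter and even image dimensions at every level. The function names `haar_split` and `wavelet_packet` are hypothetical, and the bands come out in plain recursion order rather than the strictly frequency-ordered layout the paper scans, so treat it as an illustration and not the authors' implementation.

```python
import numpy as np


def haar_split(x):
    """One level of an orthonormal 2D Haar transform (assumed filter choice).

    Returns four half-resolution arrays: the local average plus three
    detail bands (differences across width, across height, and diagonal).
    """
    a = x[..., 0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    avg = (a + b + c + d) / 2.0
    diff_w = (a - b + c - d) / 2.0  # difference between columns
    diff_h = (a + b - c - d) / 2.0  # difference between rows
    diag = (a - b - c + d) / 2.0
    return [avg, diff_w, diff_h, diag]


def wavelet_packet(x, k):
    """Apply haar_split to every band k times, yielding 4**k sub-bands.

    Unlike a standard wavelet transform, the detail bands are split again
    at each level, not only the low-pass band.
    """
    bands = [np.asarray(x, dtype=np.float64)]
    for _ in range(k):
        bands = [sub for band in bands for sub in haar_split(band)]
    # Shape: (4**k, H / 2**k, W / 2**k)
    return np.stack(bands, axis=0)


if __name__ == "__main__":
    image = np.random.default_rng(0).random((64, 64))
    bands = wavelet_packet(image, k=2)
    print(bands.shape)
```

Stacking the bands along a new leading axis gives a sequence model something to scan over in addition to the two spatial axes, which is the general idea behind scanning "along the frequency dimension".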

Paper Structure

This paper contains 20 sections, 10 equations, 9 figures, and 5 tables.

Figures (9)

  • Figure 1: Comparison of different modeling methods. Our FreqMamba enhances Mamba's 2D global perception capability from the frequency perspective. In addition, Mamba modeling along the frequency dimension is introduced to provide a seamless transition between the spatial and frequency domains.
  • Figure 2: Observation of the spectrum exchange of the Discrete Fourier Transform (DFT). The degradation is mainly in the amplitude component, and the Fourier transform can disentangle image content and degradation to some extent.
  • Figure 3: Comparison between (a) the vanilla scanning strategy employed by VMamba [liu2024vmamba] and (b) our frequency-dimension strategy. Using a $k$-level wavelet packet transform (WPT), we decompose the input into $4^k$ frequency bands ($k=2$ in the figure). The bands are then scanned along the frequency dimension in the spatial domain. This strategy introduces a new dimension to 2D-Mamba, allowing it to capture complex image details at different frequencies. Note that we use a local scanning strategy similar to LocalMamba [huang2024localmamba], which is more consistent with the form of the WPT and allows strictly frequency-ordered scanning.
  • Figure 4: The detailed architecture of our FreqMamba. The three-branch FreqSSM forms the basic block of the U-Net architecture for global and local modeling. Multi-scale degradation priors are introduced into the training process at the encoder stage.
  • Figure 5: Visualization of degradation prior attention maps and three-branch features. (a) Rainy images. (b) Attention maps. (c), (d), and (e) are the feature maps of the spatial, frequency band, and Fourier modeling branches, respectively. The spatial branch (c) recognizes raindrops comprehensively, but their boundaries are blurry. The Fourier branch (e) outputs high-contrast features and focuses more on larger rain streaks. Overall, the frequency band branch (d) falls somewhere in between.
  • ...and 4 more figures
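
The spectrum exchange described in the Figure 2 caption can be sketched with NumPy's FFT routines. This is the usual amplitude/phase swap between two images, written under the assumption that both inputs are real-valued 2D arrays of the same shape. The helper name `swap_amplitude` is hypothetical, and the snippet is an illustration of the observation rather than the authors' code.

```python
import numpy as np


def swap_amplitude(img_a, img_b):
    """Exchange the Fourier amplitude of two same-sized real images.

    Returns (a_phase_with_b_amplitude, b_phase_with_a_amplitude).
    """
    fa = np.fft.fft2(img_a)
    fb = np.fft.fft2(img_b)
    amp_a, phase_a = np.abs(fa), np.angle(fa)
    amp_b, phase_b = np.abs(fb), np.angle(fb)
    # Recombine each phase with the other image's amplitude. The result is
    # real up to floating-point error, so the imaginary part is dropped.
    a_with_b_amp = np.fft.ifft2(amp_b * np.exp(1j * phase_a)).real
    b_with_a_amp = np.fft.ifft2(amp_a * np.exp(1j * phase_b)).real
    return a_with_b_amp, b_with_a_amp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rainy = rng.random((32, 32))  # stand-in for a rainy image
    clean = rng.random((32, 32))  # stand-in for its clean counterpart
    rainy_phase_clean_amp, clean_phase_rainy_amp = swap_amplitude(rainy, clean)
    print(rainy_phase_clean_amp.shape, clean_phase_rainy_amp.shape)
```

On a real rainy/clean pair, the caption's observation suggests that the image carrying the rainy amplitude would show most of the degradation, while the image carrying the clean amplitude would look largely restored.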