WaterMamba: Visual State Space Model for Underwater Image Enhancement

Meisheng Guan, Haiyong Xu, Gangyi Jiang, Mei Yu, Yeyao Chen, Ting Luo, Yang Song

TL;DR

WaterMamba tackles underwater image enhancement (UIE) with a linear-complexity state space model that captures long-range dependencies efficiently. Its core component, the SCOSS block (comprising SCCOSS modules and an MSFFN), models pixel-level and channel-level information flow within a U-Net backbone to reconstruct high-quality underwater images with fewer parameters and a lower computational load than Transformer-based rivals. Experiments on UIEB, SQUID, UCCS, and UCIOD report state-of-the-art PSNR/SSIM and perceptual quality (UIQM/UCIQE) at reduced FLOPs, supporting the method's robustness and generalization. The work suggests a practical, scalable path toward real-time UIE, and the authors state that the code will be released on GitHub after acceptance.

Abstract

Underwater imaging often suffers from low quality due to factors affecting light propagation and absorption in water. To improve image quality, underwater image enhancement (UIE) methods based on convolutional neural networks (CNNs) and Transformers have been proposed. However, CNN-based UIE methods are limited in modeling long-range dependencies, and Transformer-based methods involve a large number of parameters and complex self-attention mechanisms, posing efficiency challenges. Considering computational complexity and severe underwater image degradation, a state space model (SSM) with linear computational complexity for UIE, named WaterMamba, is proposed. We propose spatial-channel omnidirectional selective scan (SCOSS) blocks comprising spatial-channel coordinate omnidirectional selective scan (SCCOSS) modules and a multi-scale feedforward network (MSFFN). The SCOSS block models information flow across pixels and channels to capture their dependencies. The MSFFN facilitates information flow adjustment and promotes synchronized operations within SCCOSS modules. Extensive experiments show that WaterMamba achieves cutting-edge performance with fewer parameters and computational resources, outperforming state-of-the-art methods on various datasets and validating its effectiveness and generalizability. The code will be released on GitHub after acceptance.
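
To make the "linear computational complexity" claim concrete, here is a minimal sketch of a generic discrete state space recurrence applied to a flattened image. It is not the SCOSS or SCCOSS module: the matrices A, B, C, the function name ssm_scan, and the row-by-row flattening are all illustrative assumptions, and the parameters are fixed here, whereas a selective scan would make them depend on the input. The point is only that one pass over a sequence of length L costs O(L), in contrast to the pairwise comparisons of self-attention.

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Run x_k = A x_{k-1} + B u_k, y_k = C x_k over a 1-D sequence.

    u: (L,) input sequence; A: (N, N); B: (N,); C: (N,).
    A single loop over the sequence, so the cost grows linearly with L.
    """
    x = np.zeros(A.shape[0])          # hidden state, starts at zero
    y = np.empty(len(u))
    for k, u_k in enumerate(u):
        x = A @ x + B * u_k           # state update
        y[k] = C @ x                  # readout
    return y

# Hypothetical toy input: a 2x3 single-channel "image", flattened row by row.
image = np.arange(6, dtype=float).reshape(2, 3)
seq = image.reshape(-1)

# Hypothetical fixed parameters (a real model would learn these).
A = 0.5 * np.eye(2)
B = np.ones(2)
C = np.array([1.0, 0.0])

out = ssm_scan(seq, A, B, C).reshape(image.shape)
```

Each output pixel here depends on every earlier pixel in the scan order through the hidden state, which is how a scan can carry long-range information without comparing all pixel pairs.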

Paper Structure (20 sections, 16 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Quantitative comparison of the FLOPs and parameter counts of WaterMamba with other state-of-the-art underwater image enhancement and dehazing methods.
  • Figure 2: The architecture of WaterMamba. (a) SCOSS; (b) SOSS; (c) CCOSS.
  • Figure 3: Visual comparison on different datasets; rows from top to bottom show UIEB (R90 and C60) [10] and U100 [15]. (a) raw; (b) Ucolor [13]; (c) PUIE-Net (MC) [14]; (d) PUIE-Net (MP) [14]; (e) PUGAN [25]; (f) MFEF [26]; (g) Semi-UIR [27]; (h) URSCT [29]; (i) Restormer [28]; (j) Convformer [33]; (k) X-CAUNET [5]; (l) WaterMamba; (m) reference.
  • Figure 4: Visual comparison on different datasets; rows from top to bottom show UCCS [40] and SQUID [8]. (a) raw; (b) Ucolor [13]; (c) PUIE-Net (MC) [14]; (d) PUIE-Net (MP) [14]; (e) PUGAN [25]; (f) MFEF [26]; (g) Semi-UIR [27]; (h) URSCT [29]; (i) Restormer [28]; (j) Convformer [33]; (k) X-CAUNET [5]; (l) WaterMamba.
  • Figure 5: Ablation experiment on the baseline and the core component modules of the proposed WaterMamba. (a) raw; (b) UNet with Resblocks [24]; (c) SOSS; (d) CCOSS; (e) MSFFN; (f) WaterMamba; (g) reference.
  • ...and 2 more figures