PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement

Wei-Tung Lin, Yong-Xiang Lin, Jyun-Wei Chen, Kai-Lung Hua

TL;DR

PixMamba tackles underwater image enhancement (UIE) by pairing a patch-level Efficient Mamba Net (EMNet) with a parallel pixel-level PixMamba Net (PixNet) to capture both local detail and global consistency. It leverages Structured State Space Models (SSMs) to achieve linear complexity while modeling long-range dependencies, with EMNet handling patch-based restoration through ESS2D and MUB, and PixNet handling full-image pixel-level refinement aided by Block-wise Positional Embedding (BPE). The approach yields state-of-the-art performance on the UIEB and UCCS datasets, with strong qualitative and quantitative gains and thorough ablations confirming each component’s contribution. This dual-level architecture combines efficiency and detail preservation, advancing UIE toward real-time, high-fidelity underwater restoration.

Abstract

Underwater Image Enhancement (UIE) is critical for marine research and exploration but is hindered by complex color distortions and severe blurring. Recent deep learning-based methods have achieved remarkable results, yet they struggle with high computational costs and insufficient global modeling, resulting in locally under- or over-adjusted regions. We present PixMamba, a novel architecture designed to overcome these challenges by leveraging State Space Models (SSMs) for efficient global dependency modeling. Unlike convolutional neural networks (CNNs) with limited receptive fields and transformer networks with high computational costs, PixMamba captures global contextual information while remaining computationally efficient. Our dual-level strategy features the patch-level Efficient Mamba Net (EMNet) for reconstructing enhanced image features and the pixel-level PixMamba Net (PixNet) for ensuring the fine-grained feature capture and global consistency of the enhanced image that were previously difficult to obtain. PixMamba achieves state-of-the-art performance across various underwater image datasets and delivers visually superior results. Code is available at: https://github.com/weitunglin/pixmamba.
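
To make the linear-complexity claim concrete, here is a minimal sketch of a generic state space recurrence applied to a flattened image. It is not the authors' implementation: it assumes a diagonal, time-invariant SSM with zero-order-hold discretization rather than the selective S6 block, and the names `ssm_scan` and `flatten_row_major` are hypothetical. It only illustrates that each output depends on all earlier inputs through a fixed-size state, so the cost grows linearly with the number of pixels.

```python
import math


def ssm_scan(x, a, b, c, delta):
    """Run a diagonal, time-invariant SSM over a 1-D sequence.

    Cost is O(len(x) * len(a)): one fixed-size state update per input.
    Assumes every entry of `a` is non-zero (typically negative for stability).
    """
    # Zero-order-hold discretization of h'(t) = a*h(t) + b*x(t), per state channel.
    a_bar = [math.exp(delta * ai) for ai in a]
    b_bar = [(ab - 1.0) / ai * bi for ab, ai, bi in zip(a_bar, a, b)]

    h = [0.0] * len(a)  # hidden state carried along the sequence
    y = []
    for x_t in x:
        # h_t = a_bar * h_{t-1} + b_bar * x_t  (elementwise, diagonal A)
        h = [ab * hi + bb * x_t for ab, hi, bb in zip(a_bar, h, b_bar)]
        # y_t = c . h_t
        y.append(sum(ci * hi for ci, hi in zip(c, h)))
    return y


def flatten_row_major(image):
    """Turn a 2-D list of pixel values into a 1-D scan order (row by row)."""
    return [p for row in image for p in row]


# Illustrative values only: a 2x3 single-channel "image" and a 2-channel state.
image = [[0.2, 0.4, 0.6],
         [0.8, 1.0, 0.0]]
outputs = ssm_scan(flatten_row_major(image),
                   a=[-1.0, -0.5], b=[1.0, 1.0], c=[0.5, 0.5], delta=1.0)
```

The sketch scans in a single row-major direction for brevity; any scan order can be substituted by changing how the image is flattened.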

Paper Structure (20 sections, 4 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Overall architecture of PixMamba. EMNet: Efficient Mamba Net; EMB: Efficient Mamba Block; PixNet: PixMamba Net; MUB: Mamba Upsampling Block; DS: Downsampling Block; DWConv: Depth-wise Convolution Block; S6: Mamba SSM [gu2023mamba].
  • Figure 2: Enhanced image detail visualization. Our method recovers finer detail from the degraded image than WaterMamba [guan2024watermamba] and Semi-UIR [huang2023semiuir]. As highlighted by the red circle, our approach renders detail features better than both, demonstrating the advantage of the proposed MUB and PixNet.
  • Figure 3: Qualitative comparisons. Each row, from top to bottom, shows a sample from T90 [li2020uieb]. (a) raw; (b) Ucolor [li2021ucolor]; (c) PUGAN [cong2023PUGAN]; (d) MFEF [zhou2023mfef]; (e) Semi-UIR [huang2023semiuir]; (f) Convformer [gu2022convformer]; (g) X-CAUNET [pramanick2024xcaunet]; (h) WaterMamba [guan2024watermamba]; (i) PixMamba; (j) reference.
  • Figure 4: Qualitative comparisons. The first and second rows show C60 [li2020uieb] samples; the third row shows UCCS [liu2020uccs] samples. (a) raw; (b) Ucolor [li2021ucolor]; (c) PUGAN [cong2023PUGAN]; (d) MFEF [zhou2023mfef]; (e) Semi-UIR [huang2023semiuir]; (f) Convformer [gu2022convformer]; (g) X-CAUNET [pramanick2024xcaunet]; (h) WaterMamba [guan2024watermamba]; (i) PixMamba.