
Enhancing Underwater Images via Adaptive Semantic-aware Codebook Learning

Bosen Lin, Feng Gao, Yanwei Yu, Junyu Dong, Qian Du

TL;DR

This work tackles the ill-posed nature of Underwater Image Enhancement (UIE) by introducing SUCode, a semantic-aware codebook network that uses pixel-level, category-specific codebooks to model region-dependent degradations. It employs a three-stage training paradigm—semantic-codebook pretraining, self-reconstruction of raw underwater content, and domain-adaptive enhancement via FAFF and GCAM—to avoid pseudo-ground-truth contamination and to align restoration with semantic structure. SUCode delivers state-of-the-art performance on full-reference metrics across public benchmarks and demonstrates robust cross-dataset generalization, while maintaining semantic content for downstream tasks like segmentation. The approach offers practical impact for reliable underwater perception in robotics and environmental monitoring, with public code to facilitate further research and application.

Abstract

Underwater Image Enhancement (UIE) is an ill-posed problem: natural clean references are not available, and degradation levels vary significantly across semantic regions. Existing UIE methods process each image with a single global model and ignore the inconsistent degradation of different scene components. This oversight leads to significant color distortions and loss of fine details in heterogeneous underwater scenes. Therefore, we propose SUCode (Semantic-aware Underwater Codebook Network), which achieves adaptive UIE through a semantic-aware discrete codebook representation. Compared with one-shot codebook-based methods, SUCode exploits a semantic-aware, pixel-level codebook representation tailored to heterogeneous underwater degradation. A three-stage training paradigm is employed to represent raw underwater image features, which avoids pseudo-ground-truth contamination. The Gated Channel Attention Module (GCAM) and Frequency-Aware Feature Fusion (FAFF) jointly integrate channel and frequency cues for faithful color restoration and texture recovery. Extensive experiments on multiple benchmarks demonstrate that SUCode achieves state-of-the-art performance, outperforming recent UIE methods on both reference and no-reference metrics. The code will be made publicly available at https://github.com/oucailab/SUCode.
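
The idea of a semantic-aware, pixel-level codebook can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function name `semantic_quantize`, the `(H, W, C)` feature layout, the dictionary of per-category codebooks, and the nearest-neighbor (squared Euclidean) lookup are all assumptions made for illustration. Each pixel feature is replaced by the closest code vector from the codebook of the category that the semantic mask assigns to that pixel.

```python
import numpy as np


def semantic_quantize(features, mask, codebooks):
    """Quantize each pixel feature with the codebook of its semantic category.

    Hypothetical layout (an assumption, not taken from the paper):
      features:  (H, W, C) float array of encoder features.
      mask:      (H, W) int array of per-pixel category ids.
      codebooks: dict mapping category id -> (K, C) array of code vectors.
    Returns the quantized features (H, W, C) and the selected code indices (H, W).
    Pixels whose category has no codebook keep zeros and index -1.
    """
    h, w, _ = features.shape
    quantized = np.zeros_like(features)
    indices = np.full((h, w), -1, dtype=np.int64)

    for category, codebook in codebooks.items():
        region = mask == category          # boolean (H, W) selector
        if not region.any():
            continue
        feats = features[region]           # (N, C) features in this category
        # Squared Euclidean distance from every pixel feature to every code.
        dists = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        nearest = dists.argmin(axis=1)     # (N,) index of the closest code
        quantized[region] = codebook[nearest]
        indices[region] = nearest

    return quantized, indices


# Toy example: two categories, each with its own two-entry codebook.
features = np.array([[[0.1, 0.2], [0.9, 0.8]],
                     [[11.0, 9.0], [19.0, 21.0]]])
mask = np.array([[0, 0],
                 [1, 1]])
codebooks = {
    0: np.array([[0.0, 0.0], [1.0, 1.0]]),
    1: np.array([[10.0, 10.0], [20.0, 20.0]]),
}
quantized, indices = semantic_quantize(features, mask, codebooks)
```

Because the lookup is restricted to the codebook of the pixel's own category, a pixel is never matched against codes learned for a different category, which is the property the abstract contrasts with a single global model.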

Paper Structure

This paper contains 17 sections, 20 equations, 12 figures, 10 tables.

Figures (12)

  • Figure 1: Comparison of the training and testing pipelines and the enhancement results of different codebook generation methods. The proposed SUCode's result is sharper and clearer, with more natural color.
  • Figure 2: The overall structure of the proposed SUCode. In stage I, the semantic-aware category‑specific codebooks are updated with the mask $m$. Stage II is a partition and synthesis process of the codebook, achieved through the self-reconstruction of raw underwater images. In stage III, domain conversion from the raw underwater codebook to the enhanced image is achieved.
  • Figure 3: Visualization of the learned codebooks for different underwater categories.
  • Figure 4: Structure of the proposed Gated Channel Attention Module (GCAM) and Frequency-Aware Feature Fusion (FAFF) module. rFFT and IrFFT represent the real fast Fourier transform and the inverse real fast Fourier transform, respectively. Mag and Pha represent the magnitude and phase variables, respectively. GAP refers to the global average pooling operation.
  • Figure 5: Visual comparison of UIE results sampled from the test set of the SUIM-E dataset (Qi et al., 2022). The highest PSNR results are highlighted in red and the second-best results in yellow.
  • ...and 7 more figures
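
The operations named in the Figure 4 caption (rFFT, IrFFT, magnitude, phase, GAP) can be sketched in NumPy as follows. This only illustrates the building blocks; the way GCAM and FAFF actually combine them is not described in this summary. The function names, the `(C, H, W)` layout, and the sigmoid gate with parameters `weight` and `bias` are hypothetical choices for illustration.

```python
import numpy as np


def split_magnitude_phase(x):
    """Return magnitude and phase of the 2-D real FFT of x with shape (C, H, W)."""
    spectrum = np.fft.rfft2(x, axes=(-2, -1))
    return np.abs(spectrum), np.angle(spectrum)


def merge_magnitude_phase(magnitude, phase, spatial_shape):
    """Rebuild a real (C, H, W) map from magnitude and phase via the inverse rFFT."""
    spectrum = magnitude * np.exp(1j * phase)
    return np.fft.irfft2(spectrum, s=spatial_shape, axes=(-2, -1))


def global_average_pool(x):
    """GAP: average each channel of a (C, H, W) map over its spatial axes."""
    return x.mean(axis=(-2, -1))


def channel_gate(x, weight, bias):
    """Hypothetical gate: scale each channel by sigmoid(weight @ GAP(x) + bias).

    weight has shape (C, C) and bias has shape (C,); both are assumptions.
    """
    pooled = global_average_pool(x)
    gate = 1.0 / (1.0 + np.exp(-(weight @ pooled + bias)))
    return x * gate[:, None, None]


# Round trip: splitting into magnitude and phase loses no information.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4, 6))
magnitude, phase = split_magnitude_phase(x)
reconstructed = merge_magnitude_phase(magnitude, phase, x.shape[-2:])

# With zero weights and bias the sigmoid gate is 0.5 for every channel.
gated = channel_gate(x, np.zeros((3, 3)), np.zeros(3))
```

Passing the spatial shape to the inverse transform matters because the real FFT stores only about half of the last axis, so the original width cannot be inferred from the spectrum alone when it is odd.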