Synergistic Multiscale Detail Refinement via Intrinsic Supervision for Underwater Image Enhancement

Dehuan Zhang, Jingchun Zhou, ChunLe Guo, Weishi Zhang, Chongyi Li

TL;DR

Underwater images suffer from light absorption and scattering, modeled by $I = J \cdot t + A(1 - t)$, which degrades color and contrast. The authors introduce SMDR-IS, a four-stage multi-degradation encoder-decoder framework that leverages intrinsic supervision to refine details across multiple scales. Key innovations include Adaptive Selective Intrinsic Supervised Feature (ASISF) for cross-scale feature propagation and the Bifocal Intrinsic-Context Attention (BICA) with its ReGIA and HCAFE components to balance local detail and contextual information, guided by resolution-level supervision. A multi-degradation loss enforces scale-aware learning across stages, and experiments on UIEB, U45, and LSUI show state-of-the-art performance with competitive efficiency; the code is publicly available at the provided GitHub link.
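The formation model quoted above, $I = J \cdot t + A(1 - t)$, can be illustrated with a minimal NumPy sketch. This is not taken from the authors' code: the function name `degrade`, the array shapes, and the example values are assumptions chosen for illustration. Here $J$ is the clean scene, $t$ the transmission map, and $A$ the background light.

```python
import numpy as np

def degrade(J, t, A):
    """Apply I = J * t + A * (1 - t) per pixel.

    J: clean image, shape (H, W, 3), values in [0, 1]
    t: transmission map, shape (H, W) or (H, W, 3), values in [0, 1]
    A: background light, one value per colour channel (length 3)
    All names and shapes here are illustrative assumptions.
    """
    J = np.asarray(J, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    if t.ndim == 2:
        # Share one transmission value across the colour channels.
        t = t[..., None]
    A = np.asarray(A, dtype=np.float64).reshape(1, 1, 3)
    return J * t + A * (1.0 - t)

# Hypothetical example: a uniform grey scene under bluish-green background light.
J = np.full((4, 4, 3), 0.5)
A = [0.1, 0.6, 0.7]
I_clear = degrade(J, np.ones((4, 4)), A)    # t = 1: the scene is unchanged
I_opaque = degrade(J, np.zeros((4, 4)), A)  # t = 0: only background light remains
```

The two limiting cases show why low transmission washes out color and contrast: as $t$ falls toward 0, every pixel converges to the background light $A$ regardless of the scene content.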

Abstract

Visually restoring underwater scenes primarily involves mitigating interference from underwater media. Existing methods ignore the inherent scale-related characteristics in underwater scenes. Therefore, we present the synergistic multi-scale detail refinement via intrinsic supervision (SMDR-IS) for enhancing underwater scene details, which comprises multiple stages. The low-degradation stage from the original images furnishes the original stage with multi-scale details, achieved through feature propagation using the Adaptive Selective Intrinsic Supervised Feature (ASISF) module. By using intrinsic supervision, the ASISF module can precisely control and guide feature transmission across multi-degradation stages, enhancing multi-scale detail refinement and minimizing the interference from irrelevant information in the low-degradation stage. In the multi-degradation encoder-decoder framework of SMDR-IS, we introduce the Bifocal Intrinsic-Context Attention Module (BICA). Based on the intrinsic supervision principles, BICA efficiently exploits multi-scale scene information in images. BICA directs higher-resolution spaces by tapping into the insights of lower-resolution ones, underscoring the pivotal role of spatial contextual relationships in underwater image restoration. During training, a multi-degradation loss function further strengthens the network, allowing it to extract information across diverse scales. When benchmarked against state-of-the-art methods, SMDR-IS consistently shows superior performance. The code is publicly available at: https://github.com/zhoujingchun03/SMDR-IS.

Paper Structure (16 sections, 6 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Illustration of motivation. The figure showcases similar scene information extracted from multi-resolution images alongside transmission. Degradation patterns, consistent across different positions, are evident in both the original and the downscaled images.
  • Figure 2: The overview of SMDR-IS. In the Decoder, $E_i^1$ is the same as in the Encoder. $FE_i$, $i \in \{ or,2D,4D,8D \}$ is Feature Extraction (Conv). $E_i^1$, $D_i^1$ denote the Bifocal Intrinsic-Context Attention (BICA). $E_i^j$, $j \in [2,4]$ is Downsampling+BICA. $D_i^j$ is Upsampling+BICA. $S$ denotes the Adaptive Selective Intrinsic Supervised Feature Module (ASISF). $FR_i$ is Feature Restoration (Conv).
  • Figure 3: (a) Architecture of BICA; LN represents Layer Normalization. (b) CFA; CA and SA represent Channel Attention and Spatial Attention, respectively. (c) ReGIA; "Down" and "Up" denote downsampling and upsampling, respectively. At low resolution, a $1\times1$ position corresponds to a $6\times6$ spatial feature at the original resolution. (d) ASISF, highlighting that the channels in the input and reference features can differ.
  • Figure 4: Qualitative comparison between state-of-the-art methods and the SMDR-IS on various datasets.
  • Figure 5: Subjective results of ablation experiments on the number of stages used. From left to right: (a) original images, (b)-(e) correspond to rows 1-4 of the stage-ablation table.
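
The Figure 2 caption indexes the four stages by $i \in \{ or,2D,4D,8D \}$. A minimal sketch of how such stage inputs could be prepared follows. It assumes that 2D, 4D, and 8D denote the original image downscaled by factors of 2, 4, and 8, and it uses plain average pooling; both are assumptions made for illustration, as are the function names, and the authors' resampling method may differ.

```python
import numpy as np

def downscale(img, factor):
    """Average-pool an (H, W, C) image by an integer factor.

    Rows and columns that do not fill a complete block are cropped.
    Average pooling is an illustrative choice, not the paper's stated method.
    """
    h, w, c = img.shape
    h2, w2 = h // factor, w // factor
    img = img[: h2 * factor, : w2 * factor]
    # Split each spatial axis into (blocks, factor) and average within blocks.
    return img.reshape(h2, factor, w2, factor, c).mean(axis=(1, 3))

def build_stage_inputs(img):
    """Return one input per stage, keyed by the labels used in Figure 2."""
    return {
        "or": img,                 # original resolution
        "2D": downscale(img, 2),   # assumed: 2x downscaled
        "4D": downscale(img, 4),   # assumed: 4x downscaled
        "8D": downscale(img, 8),   # assumed: 8x downscaled
    }

# Hypothetical 16 x 24 RGB input with deterministic values.
image = np.arange(16 * 24 * 3, dtype=np.float64).reshape(16, 24, 3) / (16 * 24 * 3)
stages = build_stage_inputs(image)
shapes = {name: arr.shape for name, arr in stages.items()}
```

Each stage would then pass through its own feature extraction $FE_i$ and encoder-decoder path, with the ASISF module $S$ carrying features between stages as described in the caption.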