PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

Runmin Cong, Wenyu Yang, Wei Zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

TL;DR

Underwater images suffer from color distortion and blur caused by light absorption and scattering. The paper proposes PUGAN, a physical model-guided GAN whose generator has two subnetworks. A Parameters Estimation subnetwork estimates the transmission $t$, attenuation coefficient $\beta$, and depth $d$, and inverts the physical model to produce a color-enhanced image $J^{'}$. A Two-Stream Interaction Enhancement subnetwork, guided by $J^{'}$ and $t$ and equipped with a Degradation Quantization module, produces the final enhanced image. Dual-Discriminators enforce style and content authenticity, including depth-aware constraints. Training follows a two-stage scheme on real UIE data and synthetic data, using losses $L_1$, $L_{gdl}$, $L_{GAN1}$, $L_{GAN2}$, and $L_{con}$, with a final objective $L=\lambda_1 L_{GAN1}+\lambda_2 L_{GAN2}+\lambda_3 L_1+\lambda_4 L_{con}$. Experiments on three benchmarks show state-of-the-art PSNR/MSE and competitive non-reference metrics, with improved color fidelity, detail preservation, and robustness across water types.

Abstract

Because the water medium absorbs and scatters light, underwater images usually suffer from degradations such as low contrast, color distortion, and blurred details, which make downstream underwater understanding tasks harder. Obtaining clear and visually pleasing images has therefore become a widely shared concern, and the task of underwater image enhancement (UIE) has emerged in response. Among existing UIE methods, those based on Generative Adversarial Networks (GANs) perform well in visual aesthetics, while physical model-based methods have better scene adaptability. Inheriting the advantages of both types of models, we propose a physical model-guided GAN for UIE, referred to as PUGAN. The entire network follows the GAN architecture. On the one hand, we design a Parameters Estimation subnetwork (Par-subnet) to learn the parameters for physical model inversion, and use the generated color-enhanced image as auxiliary information for the Two-Stream Interaction Enhancement subnetwork (TSIE-subnet). Meanwhile, we design a Degradation Quantization (DQ) module in the TSIE-subnet to quantize scene degradation, thereby reinforcing the enhancement of key regions. On the other hand, we design the Dual-Discriminators for the style-content adversarial constraint, promoting the authenticity and visual aesthetics of the results. Extensive experiments on three benchmark datasets demonstrate that PUGAN outperforms state-of-the-art methods in both qualitative and quantitative evaluations.

Paper Structure (20 sections, 19 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Samples of different UIE methods. (a) Original underwater image. (b)-(e) The enhancement results generated by GDCP [peng2018generalization], FUnIE-GAN [islam2020fast], Ucolor [li2021underwater], and PUGAN (ours). (f) Ground truth.
  • Figure 2: Overview of the proposed PUGAN for the UIE task, including a Phy-G and a Dual-D under the GAN architecture. In the Phy-G, the Par-subnet estimates the physical parameters (e.g., transmission map $t$ and attenuation coefficient $\beta$) required for restoring a color-enhanced image $J^{'}$. The TSIE-subnet performs CNN-based end-to-end enhancement, where a degradation quantization (DQ) module quantifies the distortion degree of the scene and thereby guides the generation of the final enhanced underwater image $E$. The objective function consists of four parts: global similarity loss ${L_1}$, perceptual loss ${L_{gdl}}$, style adversarial loss ${L_{GAN_1}}$, and content adversarial loss ${L_{GAN_2}}$.
  • Figure 3: The schematic illustration of Par-subnet. It mainly includes three modules, namely the Depth Estimator, the Attenuation Coefficient Estimator, and the Transmission Estimator. The results of the Depth Estimator and the Attenuation Coefficient Estimator are used to estimate the transmission map. Finally, the estimated transmission map and the original image are used to restore the color-enhanced underwater image through model inversion.
  • Figure 4: The schematic illustration of TSIE-subnet, following an encoder-decoder structure, in which the two-stream encoder features are transferred to the corresponding decoder layer after passing through the DQ module. The right side of this figure provides the detailed structure of the DQ module.
  • Figure 5: The scores of non-reference metrics are displayed under each visualization result.
  • ...and 3 more figures
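
The Par-subnet pipeline in the Figure 3 caption (depth and attenuation coefficient give a transmission map, which is then used for model inversion) can be illustrated with a minimal NumPy sketch. This excerpt does not state the paper's exact equations, so the sketch assumes the commonly used simplified image formation model $I = J \cdot t + A(1 - t)$ with $t = e^{-\beta d}$. The background light $A$, the clamp `t_min`, and the function names are assumptions introduced here for illustration; in PUGAN the parameters are predicted by learned estimators rather than given.

```python
import numpy as np


def transmission_from_depth(depth, beta):
    """Per-channel transmission t = exp(-beta * d) (assumed formulation).

    depth: (H, W) scene depth map.
    beta:  (3,) per-channel attenuation coefficients.
    Returns an (H, W, 3) transmission map.
    """
    depth = np.asarray(depth, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    return np.exp(-depth[..., None] * beta)


def invert_physical_model(image, t, background_light, t_min=0.1):
    """Invert I = J * t + A * (1 - t) for J.

    background_light (A) and t_min are assumptions of this sketch;
    t is clamped from below to avoid dividing by values near zero.
    """
    t = np.clip(t, t_min, 1.0)
    restored = (image - background_light * (1.0 - t)) / t
    return np.clip(restored, 0.0, 1.0)


# Round trip on synthetic data: degrade a known clean image, then invert.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(4, 4, 3))   # stand-in for J
depth = rng.uniform(0.5, 2.0, size=(4, 4))      # stand-in for d
beta = np.array([0.8, 0.3, 0.2])                # red attenuates fastest
light = np.array([0.1, 0.5, 0.6])               # assumed background light A

t = transmission_from_depth(depth, beta)
degraded = clean * t + light * (1.0 - t)
recovered = invert_physical_model(degraded, t, light)
```

With exact parameters the inversion recovers the clean image. In practice the estimates are imperfect, which is consistent with the paper using the inverted image $J^{'}$ only as auxiliary guidance for the TSIE-subnet rather than as the final output.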