DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement

Jingchun Zhou, Zongxin He, Qiuping Jiang, Kui Jiang, Xianping Fu, Xuelong Li

TL;DR

DGNet tackles underwater image enhancement under complex multi-factor degradation by introducing a dynamic gradient-guided training paradigm that combines self-updated pseudo-labels with a reference gradient. The method features two lightweight modules, FRR and FRS, to decouple degradation factors, where FRR employs Channel Combination Inference (CCI) and Fusion Sense Module (FSM) and FRS applies a fixed-weight Laplacian smoothing to suppress artifacts. A dynamic loss term $L_d^{\tau}$ works alongside conventional losses $L_1$ and $L_{SSIM}$ to steer optimization toward robust restoration and white balance. Evaluations on UIEB and other datasets show state-of-the-art PSNR/SSIM with significantly reduced parameters and faster inference (two model sizes: Our-S and Our-L), demonstrating strong generalization and practical viability for real-world underwater imaging tasks. The approach offers a practical framework for robust UIE under diverse noise, motion, and illumination conditions, with potential benefits for downstream vision tasks in underwater environments.

Abstract

Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments. To solve this issue, previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features, limiting the generalization and adaptability of the model. Previous methods also use a reference gradient constructed from original images and synthetic ground-truth images, which may cause network performance to be influenced by low-quality training data. Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space. This process improves image quality and avoids local optima. Moreover, we propose a Feature Restoration and Reconstruction module (FRR) based on a Channel Combination Inference (CCI) strategy and a Frequency Domain Smoothing module (FRS). These modules decouple other degradation features while reducing the impact of various types of noise on network performance. Experiments on multiple public datasets demonstrate the superiority of our method over existing state-of-the-art approaches, especially in achieving performance milestones: a PSNR of 25.6 dB and an SSIM of 0.93 on the UIEB dataset. Its efficiency in terms of parameter size and inference time further attests to its broad practicality. The code will be made publicly available.
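
The idea of supervising with both a fixed reference and a pseudo-label refreshed from the network's own predictions can be illustrated with a toy NumPy sketch. This is not the paper's formulation: the exact form of the dynamic loss term is not given in this excerpt, so the exponential-moving-average update, the `momentum` and `dyn_weight` parameters, and all function names below are assumptions made for illustration. The SSIM term is left out to keep the example short.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error between two images."""
    return float(np.mean(np.abs(pred - target)))


def update_pseudo_label(pseudo, pred, momentum=0.9):
    """Hypothetical update rule: blend the latest prediction into the
    pseudo-label with an exponential moving average (an assumption,
    not the rule used in the paper)."""
    return momentum * pseudo + (1.0 - momentum) * pred


def combined_loss(pred, reference, pseudo, dyn_weight=0.1):
    """Reference term plus a weighted dynamic term.

    The reference term pulls the prediction toward the dataset label;
    the dynamic term pulls it toward the self-updated pseudo-label.
    The weighting scheme is assumed for illustration.
    """
    ref_term = l1_loss(pred, reference)
    dyn_term = l1_loss(pred, pseudo)
    return ref_term + dyn_weight * dyn_term


# Toy usage with 2x2 single-channel "images".
reference = np.ones((2, 2))
pseudo = np.zeros((2, 2))
pred = np.ones((2, 2))

loss = combined_loss(pred, reference, pseudo)
pseudo = update_pseudo_label(pseudo, pred)
```

In a real training loop, the pseudo-label would be refreshed after each step, so the dynamic term changes as the predictions improve while the reference term stays fixed.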

Paper Structure (20 sections, 6 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Performance comparison between state-of-the-art methods (ReX-Net [R38], NAFNet [R36], UIEC [R26], PUIENet [R33], Ucolor [R25], SemiUIR [R37], HAAMGAN [R39], Restormer [R35], MFEM [R34]) and our methods on the UIEB V90 dataset (WaterNet [R18]). Evaluation metrics include PSNR for image quality, computational complexity (GFLOPs), and parameter size, which is represented by ball size. Our method achieves remarkable performance milestones.
  • Figure 2: Overall structure of DGNet. The FRR module deals with noise and other degradation factors, while the FRS module smooths features. FRR consists of Channel Combination Inference (CCI) and a Fusion Sense Module (FSM) (detailed in Fig. 3). We construct a dynamic gradient from the predicted image, which, along with the reference gradient, jointly supervises the network. The CBM consists of a convolution and a BN layer with a Mish activation function.
  • Figure 3: Details of the FRR module. (a) Channel Combination Inference (CCI). (b) Fusion Sense Module (FSM). The blue block is the CBM, which consists of a convolution and a BN layer with a Mish activation function. In the CCI block, we replace normal convolution with group convolution.
  • Figure 4: Structure of the Sense block, featuring LaplaConv, a fixed-weight Laplacian convolution.
  • Figure 5: Comparative visualization: (a) original image, (b) images processed by the FRR module, and (c) images processed by the FRS module. (b) and (c) are generated by the proposed DGNet. The FRS-processed image shows fewer artifacts.
  • ...and 2 more figures
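
The fixed-weight Laplacian convolution mentioned in the Figure 4 caption can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the 4-neighbour 3x3 kernel, the edge padding, the `alpha` step size, and the function names are all chosen here and are not taken from the paper, whose exact LaplaConv weights and placement inside the Sense block are not given in this excerpt. The smoothing step shown is a standard diffusion-style update, used only to show how a non-learned Laplacian filter can suppress isolated spikes.

```python
import numpy as np

# Standard 4-neighbour discrete Laplacian kernel (assumed; fixed, not learned).
LAPLACIAN_KERNEL = np.array(
    [[0.0, 1.0, 0.0],
     [1.0, -4.0, 1.0],
     [0.0, 1.0, 0.0]]
)


def laplacian_filter(image):
    """Apply the fixed 3x3 Laplacian kernel to a 2D array.

    Edge padding keeps the output the same size as the input and
    gives a zero response on constant regions, including borders.
    """
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def smooth_step(image, alpha=0.2):
    """One diffusion-style smoothing step: x + alpha * Laplacian(x).

    Isolated spikes have a strongly negative Laplacian, so they are
    damped, while flat regions are left unchanged.
    """
    return image + alpha * laplacian_filter(image)


# Toy usage: a single bright pixel is spread to its neighbours.
spike = np.zeros((5, 5))
spike[2, 2] = 1.0
smoothed = smooth_step(spike)
```

Because the kernel weights are constants, a layer like this adds no trainable parameters, which is consistent with the lightweight design the summary describes.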