HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning

Zengxi Zhang, Zhiying Jiang, Long Ma, Jinyuan Liu, Xin Fan, Risheng Liu

TL;DR

This work tackles the gap between visual enhancement and downstream perception in underwater imagery by introducing HUPE, a heuristic invertible network that enables an information-preserving, bidirectional translation between degraded underwater images and their clear counterparts. It integrates a Frequency-Aware Affine Coupling mechanism operating in both spatial and frequency domains and augments learning with gradient/depth priors through a Heuristic Prior Injector, all guided by a Semantic Collaborative Learning module to extract task-oriented semantics. The method is supported by a comprehensive loss design that enforces fidelity, frequency alignment, and reversibility, and is validated through extensive experiments on multiple datasets, showing superior performance in both image quality and downstream detection/segmentation tasks. The approach promises practical impact for underwater robotics and perception systems by delivering robust, task-friendly image enhancements across diverse underwater conditions.

Abstract

Underwater images are often affected by light refraction and absorption, reducing visibility and interfering with subsequent applications. Existing underwater image enhancement methods primarily focus on improving visual quality while overlooking practical implications. To strike a balance between visual quality and application, we propose a heuristic invertible network for underwater perception enhancement, dubbed HUPE, which enhances visual quality and demonstrates flexibility in handling other downstream tasks. Specifically, we introduce an information-preserving reversible transformation with an embedded Fourier transform to establish a bidirectional mapping between underwater images and their clear counterparts. Additionally, a heuristic prior is incorporated into the enhancement process to better capture scene information. To further bridge the feature gap between vision-based enhanced images and application-oriented images, a semantic collaborative learning module is applied in the joint optimization of the visual enhancement task and the downstream task, guiding the proposed enhancement model to extract more task-oriented semantic features while producing visually pleasing images. Extensive quantitative and qualitative experiments demonstrate the superiority of HUPE over state-of-the-art methods. The source code is available at https://github.com/ZengxiZhang/HUPE.
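To illustrate what an information-preserving reversible transformation means in practice, the following is a minimal sketch of a generic affine coupling step whose conditioning features mix a signal with its Fourier amplitude. It is not the authors' Frequency-Aware Affine Coupling: the function names, the toy `conditioner`, the weights `w_s` and `w_t`, and the 1-D signal are all assumptions made for illustration. The point is only that the inverse recovers the input exactly, whatever the conditioner computes, because the first half of the signal passes through unchanged.

```python
import cmath
import math
import random

def dft_amplitude(v):
    """Amplitude spectrum of a 1-D signal via a naive O(n^2) DFT."""
    n = len(v)
    return [
        abs(sum(v[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n)))
        / math.sqrt(n)
        for f in range(n)
    ]

def conditioner(x1, w_s, w_t):
    """Hypothetical stand-in for learned sub-networks: combines spatial
    values with frequency amplitudes to produce a log-scale and a shift."""
    amp = dft_amplitude(x1)
    feat = [a + b for a, b in zip(x1, amp)]
    log_s = [math.tanh(w_s * f) for f in feat]  # bounded, so exp() stays stable
    t = [w_t * f for f in feat]
    return log_s, t

def forward(x, w_s, w_t):
    """x1 passes through unchanged; x2 is scaled and shifted using x1."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    log_s, t = conditioner(x1, w_s, w_t)
    y2 = [b * math.exp(s) + ti for b, s, ti in zip(x2, log_s, t)]
    return x1 + y2

def inverse(y, w_s, w_t):
    """Recompute the same scale and shift from y1 (== x1) and undo them."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    log_s, t = conditioner(y1, w_s, w_t)
    x2 = [(b - ti) * math.exp(-s) for b, s, ti in zip(y2, log_s, t)]
    return y1 + x2

random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(16)]  # toy "degraded" signal
y = forward(x, 0.5, 0.3)                          # toy "enhanced" signal
x_rec = inverse(y, 0.5, 0.3)
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
print(f"max reconstruction error: {max_err:.2e}")
```

Because the conditioner only ever sees the untouched half, it can be arbitrarily complex (including Fourier-domain operations) without breaking invertibility; the reconstruction error is limited only by floating-point rounding.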

Paper Structure

This paper contains 24 sections, 11 equations, 21 figures, 2 tables.

Figures (21)

  • Figure 1: Workflow of the proposed HUPE. Depth and gradient information related to the physical model is embedded into the reversible mapping between degraded images and their clear counterparts to help the network generate credible enhanced results. A semantic collaborative learning module is introduced to help the enhancement network retain and extract as much of the image's semantic structure as possible.
  • Figure 2: Workflow of the proposed Spatial-Frequency Affine Block.
  • Figure 3: Workflow of the proposed Semantic Collaborative Learning Module.
  • Figure 4: Visual comparison on the UIEBD [li2019waternetuiebd] dataset. We also plot the pixel distributions of the enhanced results and the reference images over a uniform region. The proposed method performs best in both the visual and the distribution comparison.
  • Figure 5: Visual comparison on the UCCS [liu2020uccs] dataset. We also plot the histogram of each RGB color channel over the dataset. The x- and y-axes of each histogram represent pixel intensity and probability, respectively. The color distribution produced by the proposed method is the closest to that of in-air images.
  • ...and 16 more figures
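
The histogram comparisons described for Figures 4 and 5 can be sketched as follows. This is only an illustration under stated assumptions: the pixel values are made up, the bin count is arbitrary, and histogram intersection is one possible closeness score chosen here for simplicity, not a measure the captions name.

```python
def channel_histogram(pixels, channel, bins=8):
    """Normalized histogram (sums to 1) of one RGB channel, values in 0..255."""
    counts = [0] * bins
    for px in pixels:
        counts[px[channel] * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical RGB pixels: an in-air reference, a raw underwater image with a
# green-blue cast, and an enhanced result.
reference = [(200, 120, 90), (180, 130, 100), (210, 110, 80), (190, 125, 95)]
raw = [(60, 150, 140), (50, 160, 150), (70, 140, 130), (55, 155, 145)]
enhanced = [(195, 118, 92), (175, 128, 105), (205, 115, 85), (185, 122, 98)]

ref_red = channel_histogram(reference, 0)
red_raw = intersection(channel_histogram(raw, 0), ref_red)
red_enh = intersection(channel_histogram(enhanced, 0), ref_red)
print(f"red-channel overlap with reference: raw={red_raw:.2f}, enhanced={red_enh:.2f}")
```

Here the x-axis of each histogram corresponds to the intensity bins and the y-axis to the fraction of pixels per bin, matching the axes described in the Figure 5 caption.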