LUIEO: A Lightweight Model for Integrating Underwater Image Enhancement and Object Detection

Bin Li, Li Li, Zhenwei Zhang, Yuping Duan

TL;DR

LUIEO tackles underwater object detection under degraded visibility by jointly optimizing image enhancement and detection in a lightweight, model-driven framework. It couples a physics-guided enhancement module, which decomposes each image into a clean image J, background light B, and a transmission map t, with a detection branch that fuses multi-scale features from the enhancement decoder and uses an anchor-free head for efficiency. A refined synthetic underwater dataset and self-supervision from the underwater imaging model allow both tasks to be trained simultaneously, while a balanced loss ties the enhancement and detection objectives together. The approach improves mAP and visual quality while running in real time, which makes it practical for marine monitoring on resource-constrained platforms. The authors acknowledge the domain gap between synthetic and real data and propose multimodal fusion as future work.

Abstract

Underwater optical images inevitably suffer from degradation such as blurring, low contrast, and color distortion, which reduces the accuracy of object detection. Because paired underwater/clean images are scarce, most methods enhance first and detect afterwards, so no features are shared between the two learning tasks. Moreover, the degradation factors of underwater images are diverse while the number of samples is limited, so existing enhancement methods struggle to enhance degraded images from unseen water bodies, which limits the gains in detection accuracy. As a result, most underwater detection results are still displayed on degraded images, making it difficult to judge their correctness visually. To address these issues, this paper proposes a multi-task learning method that simultaneously enhances underwater images and improves detection accuracy. Compared with single-task learning, the integrated model allows information communication and sharing between the tasks to be adjusted dynamically. Because real underwater images provide only annotated object labels, the paper introduces physical constraints to ensure that the object detection task does not interfere with the image enhancement task. Specifically, a physical module decomposes each underwater image into a clean image, background light, and a transmission map, and a physical model re-synthesizes the underwater image from these components for self-supervision. Numerical experiments demonstrate that the proposed model achieves satisfactory visual quality, detection accuracy, and detection efficiency compared with state-of-the-art methods.
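
The self-supervision idea in the abstract can be illustrated with the simplified underwater image formation model that is standard in this literature, $I = J \cdot t + B \cdot (1 - t)$, where $J$ is the clean image, $B$ the background light, and $t$ the transmission. The sketch below is a minimal pure-Python illustration and not the authors' implementation: the function names, the choice of a single global background light, and the mean-absolute-error reconstruction loss are all assumptions made for the example.

```python
# Sketch of the simplified underwater image formation model,
# I = J * t + B * (1 - t), applied per pixel and per colour channel.
# All names below are illustrative and are NOT taken from the paper.

def compose_underwater(clean, background, transmission):
    """Re-synthesise a degraded image from its three components.

    clean        -- list of pixels, each an (r, g, b) tuple in [0, 1]   (J)
    background   -- one (r, g, b) tuple; assumed global background light (B)
    transmission -- list of per-pixel (r, g, b) transmission values      (t)
    """
    return [
        tuple(j * t + b * (1.0 - t) for j, b, t in zip(pix, background, trans))
        for pix, trans in zip(clean, transmission)
    ]


def reconstruction_loss(observed, clean, background, transmission):
    """Mean absolute error between the observed image and its re-synthesis.

    The L1 form is an assumption; the source does not state which
    distance the self-supervised term uses.
    """
    recon = compose_underwater(clean, background, transmission)
    diffs = [
        abs(o - r)
        for obs_pix, rec_pix in zip(observed, recon)
        for o, r in zip(obs_pix, rec_pix)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    J = [(0.8, 0.6, 0.4), (0.2, 0.3, 0.5)]   # two "clean" pixels
    B = (0.1, 0.5, 0.7)                      # bluish-green background light
    t = [(0.5, 0.5, 0.5), (1.0, 1.0, 1.0)]   # second pixel is unattenuated
    I = compose_underwater(J, B, t)
    print(I)
    print(reconstruction_loss(I, J, B, t))
```

With full transmission ($t = 1$) the model returns the clean pixel unchanged, and with zero transmission it returns the background light. Because the loss compares the observed image only with its own re-synthesis, it needs no clean reference image, which is why such a term suits real underwater data that carry detection labels but no paired ground truth.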

Paper Structure

This paper contains 23 sections, 16 equations, 13 figures, 8 tables.

Figures (13)

  • Figure 1: The radiance perceived by the camera, $I_\lambda$, is the sum of the direct signal and background scattering. The black arrow represents the direct signal, which carries the scene information, while the dashed arrow represents the scattered signal reflected by suspended underwater particles.
  • Figure 2: A lightweight model that integrates image enhancement and object detection. The inverted residual structures are denoted MV2. The image enhancement task decomposes an underwater image into a clean image, background light, and a transmission map, which enables self-supervised enhancement of real underwater images. During the image enhancement decoding process, a path aggregation module fuses multi-scale feature maps, and a decoupled anchor-free detection head identifies underwater targets.
  • Figure 3: Illustration of residual network structure, inverted residual network structure, and spatial pyramid pooling fast (SPPF) structure.
  • Figure 4: Illustration of MobileViT architecture combining CNN with Transformer.
  • Figure 5: Enhancement results of the proposed model for common underwater degradation types.
  • ...and 8 more figures