A 7K Parameter Model for Underwater Image Enhancement based on Transmission Map Prior

Fuheng Zhou, Dikai Wei, Ye Fan, Yulong Huang, Yonggang Zhang

TL;DR

This paper tackles underwater image enhancement under strict resource constraints by introducing LSNet, a 7K-parameter network that avoids latent-space encoding. LSNet leverages transmission-map priors and a novel top-k selective attention module to decompose the enhancement into a compensation term and an over-exposure attenuation term, expressed as $J(x)=I(x)+I_{compensate}(x)-I_{exposed}(x)$. The approach achieves competitive quality on multiple benchmarks with far fewer parameters than state-of-the-art models, highlighting significant gains in efficiency and practicality for on-device deployment. The work demonstrates the viability of transmission-map-inspired, lightweight architectures for underwater restoration and provides ablation evidence of the key components contributing to performance, along with a discussion of limitations and future refinements.
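The decomposition $J(x)=I(x)+I_{compensate}(x)-I_{exposed}(x)$ can be illustrated with a minimal per-pixel sketch. The function name `enhance`, the flat-list pixel representation, and the final clipping to $[0, 1]$ are assumptions made for illustration; in LSNet the two correction terms would be predicted by the network, whereas here they are hand-picked numbers.

```python
def enhance(raw, compensate, exposed):
    """Apply J(x) = I(x) + I_compensate(x) - I_exposed(x) to each pixel.

    All three inputs are flat lists of intensities in [0, 1].
    Clipping the result to [0, 1] is an assumption of this sketch.
    """
    if not (len(raw) == len(compensate) == len(exposed)):
        raise ValueError("all maps must have the same number of pixels")
    return [min(1.0, max(0.0, i + c - e))
            for i, c, e in zip(raw, compensate, exposed)]


# Hand-picked example values, not results from the paper.
raw = [0.10, 0.50, 0.95]          # a dark, a mid-tone, and a bright pixel
compensate = [0.30, 0.10, 0.10]   # brightening applied to attenuated pixels
exposed = [0.00, 0.05, 0.20]      # attenuation applied to over-exposed pixels
enhanced = enhance(raw, compensate, exposed)
```

The dark pixel is raised mainly by the compensation term, while the bright pixel is pulled down by the over-exposure term, so the two corrections act on opposite ends of the intensity range.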

Abstract

Although deep-learning-based models for underwater image enhancement have achieved good performance, they remain limited in both compactness and effectiveness, which hinders their deployment on resource-constrained platforms. Moreover, most existing models compress the input to extract high-level semantic information in a latent space instead of using the original information, so they require decoder blocks to regenerate the details of the output, at additional computational cost. This paper proposes a lightweight selective attention network (LSNet) based on top-k selective attention and a transmission-map mechanism. With only 7K parameters, the proposed model reaches 97% of the PSNR of a similar attention-based model. Extensive experiments show that LSNet performs competitively with state-of-the-art models while using significantly fewer parameters and computational resources. The code is available at https://github.com/FuhengZhou/LSNet.
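The abstract does not spell out how the top-k selective attention is computed. As a rough illustration of the general idea only, the sketch below implements a generic top-k attention for a single query: score every key, keep the k highest-scoring positions, apply a softmax over those positions, and give every other position zero weight. The function name `topk_attention` and the dot-product scoring are assumptions and are not taken from the LSNet code.

```python
import math


def topk_attention(query, keys, values, k):
    """Generic top-k attention for one query (illustrative sketch only)."""
    if not 1 <= k <= len(keys):
        raise ValueError("k must be between 1 and the number of keys")
    # Dot-product similarity between the query and every key.
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    # Indices of the k largest scores; all other positions are masked out.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected positions.
    peak = max(scores[i] for i in top)
    exps = {i: math.exp(scores[i] - peak) for i in top}
    total = sum(exps.values())
    weights = [exps.get(i, 0.0) / total for i in range(len(keys))]
    # Weighted sum of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return output, weights


# Toy inputs chosen by hand for illustration.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
output, weights = topk_attention([1.0, 1.0], keys, values, k=2)
```

Only two of the four positions receive a non-zero weight here, which is the property that makes such a mechanism selective: low-scoring positions contribute nothing to the output.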

Paper Structure

This paper contains 15 sections, 12 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Raw and reference images with histograms of their red, green, and blue channels. In each histogram, the red curve represents the raw image and the green curve represents the reference image. Compared with the reference images, the red- and blue-channel pixels of the raw image cluster on the left, while the green channel is more uniformly distributed.
  • Figure 2: Comparison of LSNet with other deep learning methods. LSNet performs no downsampling, so it needs no reconstruction module for high-level features, whereas other methods must regenerate the details lost during downsampling.
  • Figure 3: Details of the LSNet structure. LSNet uses the chunk function to move the inputs from the channel dimension to the batch dimension so that the GPU is used more efficiently.
  • Figure 4: Visual results of LSNet. From left to right: (a) original, (b) FA+Net, (c) FUnIE, (d) NU2Net, (e) P2CNet, (f) UIEC2Net, (g) U-shape, (h) LSNet, (i) ground truth. LSNet achieves good results while keeping its parameter count extremely low.
  • Figure 5: Results of LSNet and the other models. The table on the left reports the FLOPs, parameter count, and runtime of each model. The right part shows the PSNR and parameter count of each model. LSNet achieves results competitive with the similar attention-based model U-shape (TIP 23) using only 0.03% of its parameters.
  • ...and 2 more figures
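The channel-to-batch rearrangement mentioned in the Figure 3 caption can be sketched with plain nested lists. This is an illustration under assumptions: the helper names `channels_to_batch` and `batch_to_channels` are hypothetical, the layout is taken to be [batch][channel][height][width], and the actual LSNet code presumably does this with a tensor library's chunk operation rather than with lists.

```python
def channels_to_batch(batch):
    """Turn [B][C][H][W] into [B*C][1][H][W].

    Each channel of each image becomes its own single-channel sample,
    so one shared single-channel operation can process all of them together.
    """
    return [[channel] for image in batch for channel in image]


def batch_to_channels(flat, channels):
    """Inverse step: regroup every `channels` consecutive samples into one image."""
    if len(flat) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    return [[sample[0] for sample in flat[i:i + channels]]
            for i in range(0, len(flat), channels)]


# Two tiny 3-channel images of size 1x2, with made-up values.
batch = [
    [[[0.1, 0.2]], [[0.3, 0.4]], [[0.5, 0.6]]],
    [[[0.7, 0.8]], [[0.9, 1.0]], [[0.0, 0.1]]],
]
flat = channels_to_batch(batch)        # 6 samples with 1 channel each
restored = batch_to_channels(flat, 3)  # back to 2 images with 3 channels
```

Folding channels into the batch lets the same small set of weights serve every color channel, which is one plausible way a network could keep its parameter count low while still keeping the GPU busy.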