
Harnessing Multi-resolution and Multi-scale Attention for Underwater Image Restoration

Alik Pramanick, Arijit Sur, V. Vijaya Saradhi

TL;DR

This work tackles underwater image degradation by proposing Lit-Net, a lightweight, multi-stage network that preserves the original resolution in the first stage, refines features in the second stage, and reconstructs the image in the final stage. It introduces MRAN and MSAN to perform multi-resolution and multi-scale analysis concurrently, using parallel $1\times1$ convolution branches in the encoder and attention-based skip connections to maintain spatial precision and semantic richness. A tailored loss that combines a weighted color-channel $l_1$ term ($cl_1$), a perceptual loss, and an SSIM loss drives color fidelity and texture preservation, yielding state-of-the-art PSNR/SSIM on EUVP, UIEB, and SUIM-E, along with strong qualitative results and favorable perceptual metrics. The approach also benefits downstream underwater perception tasks such as semantic segmentation and object detection, suggesting practical impact for AUVs and surveillance. Code is available on GitHub for reproducibility.
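
As a rough illustration of the weighted color-channel term mentioned above, the sketch below computes a per-channel mean absolute error and combines the channels with weights. It is a minimal sketch under stated assumptions, not the authors' $cl_1$ definition: the function name `weighted_channel_l1`, the nested-list image layout (height × width × channels), and the example weights are all hypothetical, and the exact weighting used in the paper is not given in this summary.

```python
from typing import Sequence

# Hypothetical image layout: rows -> pixels -> channel values (H x W x C).
Image = Sequence[Sequence[Sequence[float]]]


def weighted_channel_l1(pred: Image, target: Image,
                        weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Per-channel mean absolute error, combined with per-channel weights.

    The weights are illustrative placeholders, not values from the paper.
    """
    n_channels = len(weights)
    abs_sums = [0.0] * n_channels  # running sum of |pred - target| per channel
    n_pixels = 0
    for row_p, row_t in zip(pred, target):
        for px_p, px_t in zip(row_p, row_t):
            for c in range(n_channels):
                abs_sums[c] += abs(px_p[c] - px_t[c])
            n_pixels += 1
    if n_pixels == 0:
        raise ValueError("images must contain at least one pixel")
    # Weighted sum of the per-channel mean absolute errors.
    return sum(w * s / n_pixels for w, s in zip(weights, abs_sums))
```

Giving one channel a larger weight than the others makes errors in that channel cost more. A full training objective along the lines described above would add perceptual and SSIM terms on top of this one.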

Abstract

Underwater imagery is often compromised by factors such as color distortion and low contrast, posing challenges for high-level vision tasks. Recent underwater image restoration (UIR) methods either analyze the input image at full resolution, resulting in spatial richness but contextual weakness, or progressively from high to low resolution, yielding reliable semantic information but reduced spatial accuracy. Here, we propose a lightweight multi-stage network called Lit-Net that focuses on multi-resolution and multi-scale image analysis for restoring underwater images while retaining original resolution during the first stage, refining features in the second, and focusing on reconstruction in the final stage. Our novel encoder block utilizes parallel $1\times1$ convolution layers to capture local information and speed up operations. Further, we incorporate a modified weighted color channel-specific $l_1$ loss ($cl_1$) function to recover color and detail information. Extensive experiments on publicly available datasets suggest our model's superiority over recent state-of-the-art methods, with significant improvements in qualitative and quantitative measures, such as $29.477$ dB PSNR ($1.92\%$ improvement) and $0.851$ SSIM ($2.87\%$ improvement) on the EUVP dataset. The contributions of Lit-Net offer a more robust approach to underwater image enhancement and super-resolution, which is of considerable importance for underwater autonomous vehicles and surveillance. The code is available at: https://github.com/Alik033/Lit-Net.
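
For readers unfamiliar with the PSNR figure quoted above, the following sketch shows how PSNR in dB is conventionally computed from the mean squared error between a restored image and its reference. It is a minimal sketch, not the paper's evaluation code: the function name `psnr`, the nested-list image layout, and the assumed intensity range of $[0, 1]$ (`max_value=1.0`) are assumptions, and the evaluation protocol actually used by the authors is not stated in this summary.

```python
import math
from typing import Sequence

# Hypothetical image layout: rows -> pixels -> channel values (H x W x C).
Image = Sequence[Sequence[Sequence[float]]]


def psnr(pred: Image, target: Image, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    sq_err = 0.0
    count = 0
    for row_p, row_t in zip(pred, target):
        for px_p, px_t in zip(row_p, row_t):
            for a, b in zip(px_p, px_t):
                sq_err += (a - b) ** 2
                count += 1
    if count == 0:
        raise ValueError("images must contain at least one value")
    mse = sq_err / count
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a small percentage gain in dB can correspond to a noticeably lower mean squared error.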

Paper Structure (37 sections, 16 equations, 15 figures, 11 tables)


Figures (15)

  • Figure 1: The first row shows the tricolor histogram and pie chart of the degraded underwater image, the second row shows those of the image enhanced by our Lit-Net, and the third row shows those of the ground-truth image.
  • Figure 2: Visual comparisons of (a) an enhancement example on a low-contrast underwater image and (b) a 4× super-resolution example on an underwater image. The proposed Lit-Net model enhances the contrast and removes the color deviation.
  • Figure 3: Overview of our proposed model for underwater image enhancement (UIE) and underwater image super-resolution (UISR). The model accepts a degraded underwater image as input and produces an improved image that is visually and spatially enhanced.
  • Figure 4: Network architecture of the CBAM block (Woo et al., 2018).
  • Figure 5: Network architecture of proposed encoder and decoder layer.
  • ...and 10 more figures