SINET: Sparsity-driven Interpretable Neural Network for Underwater Image Enhancement

Gargi Panda, Soumitra Kundu, Saumik Bhattacharya, Aurobinda Routray

TL;DR

Underwater image enhancement must overcome scattering and wavelength-dependent attenuation. The authors propose SINET, a sparsity-driven interpretable neural network based on a channel-specific convolutional sparse coding (CCSC) model. Its sparse feature estimation blocks (SFEBs) unroll an $\ell_1$-regularized CSC solver for per-channel feature recovery, with $I_{s_i} = D_i(z_i)$ and $I_{e_i} = G_i(z_i)$. The architecture processes each color channel separately and concatenates the results to form $I_e$, achieving a PSNR improvement of $1.05$ dB on LSUI while reducing FLOPs by up to $3873\times$ relative to the strongest baselines. Experiments on UIEB/LSUI/UIEBC demonstrate improved UIE quality, enhanced interpretability through channel-wise sparse features, and substantial efficiency gains, positioning SINET as a practical, model-based approach for underwater imaging.

Abstract

Improving the quality of underwater images is essential for advancing marine research and technology. This work introduces a sparsity-driven interpretable neural network (SINET) for the underwater image enhancement (UIE) task. Unlike pure deep learning methods, our network architecture is based on a novel channel-specific convolutional sparse coding (CCSC) model, ensuring good interpretability of the underlying image enhancement process. The key feature of SINET is that it estimates the salient features from the three color channels using three sparse feature estimation blocks (SFEBs). The architecture of SFEB is designed by unrolling an iterative algorithm for solving the $\ell_1$ regularized convolutional sparse coding (CSC) problem. Our experiments show that SINET surpasses state-of-the-art PSNR value by $1.05$ dB with $3873$ times lower computational complexity. Code can be found at: https://github.com/gargi884/SINET-UIE/tree/main.
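
To make the idea of "unrolling an iterative algorithm" for the $\ell_1$-regularized CSC problem more concrete, the following is a minimal sketch of the classical iterative shrinkage-thresholding algorithm (ISTA) applied to convolutional sparse coding. It is not the authors' SFEB: it assumes 1-D signals, fixed random filters of equal length, and hypothetical names (`ista_csc`, `synthesize`, `soft_threshold`, `lam`), and is only meant to show the kind of iteration that an unrolled network typically turns into layers with learned convolutions and thresholds.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (element-wise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def synthesize(filters, codes):
    # Convolutional synthesis: x_hat = sum_k d_k * z_k ("full" convolution).
    return sum(np.convolve(z, d, mode="full") for d, z in zip(filters, codes))

def objective(x, filters, codes, lam):
    # 0.5 * ||x - sum_k d_k * z_k||^2 + lam * sum_k ||z_k||_1
    r = x - synthesize(filters, codes)
    return 0.5 * np.dot(r, r) + lam * sum(np.abs(z).sum() for z in codes)

def ista_csc(x, filters, lam=0.1, n_iter=50):
    # Assumes all filters share the same length m, so each code has
    # length len(x) - m + 1.
    n = len(x) - len(filters[0]) + 1
    codes = [np.zeros(n) for _ in filters]
    # Conservative step size: sum_k ||d_k||_1^2 upper-bounds the Lipschitz
    # constant of the gradient of the data-fidelity term.
    step = 1.0 / sum(np.abs(d).sum() ** 2 for d in filters)
    for _ in range(n_iter):
        residual = x - synthesize(filters, codes)
        # "valid" correlation is the adjoint of "full" convolution with d.
        codes = [
            soft_threshold(
                z + step * np.correlate(residual, d, mode="valid"),
                step * lam,
            )
            for d, z in zip(filters, codes)
        ]
    return codes

# Toy usage: recover sparse codes of a synthetic signal (illustrative data only).
rng = np.random.default_rng(0)
filters = [rng.standard_normal(5) for _ in range(3)]
true_codes = [np.zeros(60) for _ in filters]
for z in true_codes:
    z[rng.choice(60, size=3, replace=False)] = rng.standard_normal(3)
x = synthesize(filters, true_codes)
codes = ista_csc(x, filters, lam=0.05, n_iter=200)
print("objective:", objective(x, filters, codes, 0.05))
```

In an unrolled design, the loop would be truncated to a small fixed number of iterations and the filters, step size, and threshold would be trained end to end. Under the paper's channel-specific model, one such block would presumably be applied to each color channel.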

Paper Structure

This paper contains 12 sections, 12 equations, 4 figures, and 1 table.

Figures (4)

  • Figure 1: Network architecture of SINET and the structure of SFEB.
  • Figure 2: Visual comparison with SOTA methods. First two rows: images from the LSUI dataset. Third row: image from the UIEBC dataset, which has no ground truth (GT) image. Images are best viewed at $400\%$ zoom.
  • Figure 3: Visualization of intermediate features.
  • Figure 4: Ablation experiments with different settings.