
Underwater Image Enhancement by Convolutional Spiking Neural Networks

Vidya Sudevan, Fakhreddine Zayer, Rizwana Kausar, Sajid Javed, Hamad Karki, Giulia De Masi, Jorge Dias

TL;DR

The paper tackles underwater image enhancement by introducing UIE-SNN, the first convolutional spiking neural network for UIE. It presents a 19-layer spiking encoder–decoder with skip connections, trained end-to-end via surrogate-gradient backpropagation through time, achieving energy savings of about $85\%$ while maintaining competitive PSNR and SSIM on UIEB and EUVP, even on unseen datasets. Key contributions include direct membrane-potential–based training, a spike-output-to-spike-output skip strategy, and thorough ablations showing optimal threshold $V_{th}=0.25$, timesteps $T=5$, and depth $=4$. The work demonstrates strong potential for energy-efficient underwater vision on edge or neuromorphic hardware, with public code to enable reproducibility and further research.

Abstract

Underwater image enhancement (UIE) is fundamental for marine applications, including autonomous vision-based navigation. Deep learning methods using convolutional neural networks (CNN) and vision transformers have advanced UIE performance. Recently, spiking neural networks (SNN) have gained attention for their lightweight design, energy efficiency, and scalability. This paper introduces UIE-SNN, the first SNN-based UIE algorithm to improve the visibility of underwater images. UIE-SNN is a 19-layer convolutional spiking encoder-decoder framework with skip connections, directly trained using a surrogate gradient-based backpropagation through time (BPTT) strategy. We explore and validate the influence of training datasets on energy reduction, a unique advantage of the UIE-SNN architecture, in contrast to conventional learning-based architectures, where energy consumption is model-dependent. UIE-SNN optimizes the loss function in the latent space representation to reconstruct clear underwater images. Our algorithm performs on par with its non-spiking counterpart in terms of PSNR and structural similarity index (SSIM) at reduced timesteps ($T=5$) and with an energy reduction of $85\%$. The algorithm is trained on two publicly available benchmark datasets, UIEB and EUVP, and tested on unseen images from UIEB, EUVP, LSUI, U45, and our custom UIE dataset. The UIE-SNN algorithm achieves a PSNR of $17.7801$ dB and an SSIM of $0.7454$ on UIEB, and a PSNR of $23.1725$ dB and an SSIM of $0.7890$ on EUVP. UIE-SNN achieves this algorithmic performance with fewer operations ($147.49$ GSOPs) and less energy ($0.1327$ J) than its non-spiking counterpart ($218.88$ GFLOPs and $1.0068$ J). Compared with existing SOTA UIE methods, UIE-SNN achieves an average $6.5\times$ improvement in energy efficiency. The source code is available at https://github.com/vidya-rejul/UIE-SNN.git.
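The energy figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the four numbers given above (operation counts and energies for UIE-SNN and its non-spiking counterpart). The per-operation energies it derives are implied by those numbers and are not stated in the text; the variable names are illustrative.

```python
# Figures taken from the abstract.
snn_gsops = 147.49      # giga synaptic operations (UIE-SNN)
snn_energy_j = 0.1327   # joules (UIE-SNN)
cnn_gflops = 218.88     # giga floating-point operations (non-spiking counterpart)
cnn_energy_j = 1.0068   # joules (non-spiking counterpart)

# Relative energy reduction of the SNN versus the non-spiking counterpart.
energy_reduction = 1.0 - snn_energy_j / cnn_energy_j

# How many times less energy the SNN uses.
energy_ratio = cnn_energy_j / snn_energy_j

# Implied energy per operation in picojoules (derived here, not quoted in the text).
pj_per_sop = snn_energy_j / (snn_gsops * 1e9) * 1e12
pj_per_flop = cnn_energy_j / (cnn_gflops * 1e9) * 1e12

print(f"energy reduction: {energy_reduction:.1%}")
print(f"energy ratio:     {energy_ratio:.2f}x")
print(f"pJ per SOP:       {pj_per_sop:.2f}")
print(f"pJ per FLOP:      {pj_per_flop:.2f}")
```

Under these numbers the reduction comes out slightly above the rounded $85\%$ quoted in the abstract. The saving stems from two factors: fewer operations overall, and a lower implied energy cost for each spike-driven operation than for each floating-point operation.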

Paper Structure

This paper contains 28 sections, 20 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: (a) Overview of the UIE-SNN framework: raw images are first converted into time-dependent image sequences, which are then fed to the convolutional spiking encoder-decoder structure to extract high-level features and reconstruct the visibility-enhanced images while minimizing artifacts. (b) Evaluation of UIE-SNN against its CNN counterpart: the proposed UIE-SNN framework shows comparable algorithmic performance at significantly reduced GFLOPs and energy consumption.
  • Figure 2: Detailed UIE-SNN Model: In phase (A), raw images are converted into time-dependent sequences. Phase (B) involves the first CL and LIF neuron, transforming pixel values into sparse spike representations while extracting contextual features in the encoder block. Phase (C) shows the latent space, and in phase (D), the visibility-enhanced image is reconstructed by minimizing reconstruction loss.
  • Figure 3: Computational graph of LIF neuron over the timesteps.
  • Figure 4: Overview of spike encoding: In the first stage, feature maps are generated by convolving each channel of the input RGB image with its respective kernel. The convolved data are then fed into a spiking neuron, where the membrane potential ($V$) is updated over the timesteps. When $V$ exceeds the threshold membrane potential ($V_{th}$), a spike is fired, and $V$ resets to a predefined value.
  • Figure 5: Detailed structural description of UIE-SNN for a single timestep: The continuous-valued input image is first converted into its equivalent spike representation by the first convolution layer followed by the LIF neuron. The encoder path consists of four SEB blocks that extract high-level spatial and temporal information. In the latent space representation, additional high-level features are extracted using convolutional spiking blocks. In the decoding phase, SDB blocks progressively up-sample the feature maps, integrating skip connections from the encoder to preserve spatial features. Finally, the visibility-enhanced image, with the same spatial resolution as the input image, is reconstructed by the final convolutional spiking layer. The size of the output feature map at the end of each layer is presented.
  • ...and 9 more figures
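
The spike-encoding behaviour described in the captions for Figures 2 to 4 can be illustrated with a minimal NumPy sketch of a leaky integrate-and-fire (LIF) neuron. It is not the authors' implementation. The threshold $V_{th}=0.25$ and $T=5$ come from the TL;DR. The leak factor `beta`, the reset value of zero, and the function name `lif_forward` are assumptions made for illustration. The convolution stage and the surrogate gradient used for training are omitted, and the time-dependent sequence is built by repeating the same input at every timestep, which is also an assumption.

```python
import numpy as np

def lif_forward(current, v_th=0.25, beta=0.9, v_reset=0.0):
    """Run LIF neurons over a (T, ...) array of input currents.

    beta (leak factor) and v_reset are hypothetical values chosen for
    illustration; v_th = 0.25 is the threshold reported in the TL;DR.
    Returns a binary spike array with the same shape as `current`.
    """
    v = np.full(current.shape[1:], v_reset, dtype=np.float64)
    spikes = np.zeros_like(current, dtype=np.float64)
    for t in range(current.shape[0]):
        v = beta * v + current[t]        # leak, then integrate the input
        fired = v > v_th                 # spike where V exceeds V_th
        spikes[t] = fired
        v = np.where(fired, v_reset, v)  # reset neurons that fired
    return spikes

T = 5                                    # timesteps, as in the TL;DR
pixels = np.array([0.0, 0.1, 0.3])       # stand-in for convolved pixel values
sequence = np.repeat(pixels[None, :], T, axis=0)  # same input at each timestep
spikes = lif_forward(sequence)
spike_counts = spikes.sum(axis=0)
print(spike_counts)
```

In this toy setting, stronger inputs cross the threshold more often, so the spike count over the $T$ timesteps acts as a sparse, rate-like code for the input intensity. A zero input never fires, which is where the sparsity, and hence the energy saving, comes from.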