snnTrans-DHZ: A Lightweight Spiking Neural Network Architecture for Underwater Image Dehazing

Vidya Sudevan, Fakhreddine Zayer, Rizwana Kausar, Sajid Javed, Hamad Karki, Giulia De Masi, Jorge Dias

TL;DR

This work introduces snnTrans-DHZ, a lightweight spiking neural network architecture for underwater image dehazing. A static image is expanded into a time-dependent RGB sequence, which is processed concurrently with its LAB color space representation. The network combines ALIF neurons (LIF neurons with a learnable threshold), a spiking transformer-based K estimator, and a Background Light Estimator, whose outputs feed a soft image reconstruction module that produces the haze-free image. It is trained via surrogate-gradient backpropagation through time with a task-specific combined loss. On UIEB, snnTrans-DHZ reaches a PSNR of 21.68 dB and an SSIM of 0.8795; on EUVP, 23.46 dB and 0.8439. It does so with only 0.567 million parameters, 7.42 GSOPs, and 0.0151 J of energy, which the authors report as significantly more efficient than existing state-of-the-art methods. This balance of restoration quality and energy efficiency makes the approach a candidate for deployment on energy-constrained underwater robots and, potentially, on neuromorphic hardware such as Loihi.

Abstract

Underwater image dehazing is critical for vision-based marine operations because light scattering and absorption can severely reduce visibility. This paper introduces snnTrans-DHZ, a lightweight Spiking Neural Network (SNN) specifically designed for underwater dehazing. By leveraging the temporal dynamics of SNNs, snnTrans-DHZ efficiently processes time-dependent raw image sequences while maintaining low power consumption. Static underwater images are first converted into time-dependent sequences by repeatedly inputting the same image over user-defined timesteps. These RGB sequences are then transformed into LAB color space representations and processed concurrently. The architecture features three key modules: (i) a K estimator that extracts features from multiple color space representations; (ii) a Background Light Estimator that jointly infers the background light component from the RGB-LAB images; and (iii) a soft image reconstruction module that produces haze-free, visibility-enhanced outputs. The snnTrans-DHZ model is directly trained using a surrogate gradient-based backpropagation through time (BPTT) strategy alongside a novel combined loss function. Evaluated on the UIEB benchmark, snnTrans-DHZ achieves a PSNR of 21.68 dB and an SSIM of 0.8795, and on the EUVP dataset, it yields a PSNR of 23.46 dB and an SSIM of 0.8439. With only 0.5670 million network parameters, and requiring just 7.42 GSOPs and 0.0151 J of energy, the algorithm significantly outperforms existing state-of-the-art methods in terms of efficiency. These features make snnTrans-DHZ highly suitable for deployment in underwater robotics, marine exploration, and environmental monitoring.
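
The input preparation described in the abstract (repeating a static image over user-defined timesteps and deriving a LAB representation alongside the RGB one) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `srgb_to_lab` and `expand_over_time`, the nested-list image layout, and the choice of the standard sRGB to CIE LAB conversion with a D65 white point are all assumptions; the paper may use a different conversion or library.

```python
def srgb_to_lab(rgb):
    """Convert one sRGB pixel (components in [0, 1]) to CIE LAB.

    Assumption: standard sRGB primaries and a D65 reference white.
    """
    def to_linear(c):
        # Undo the sRGB gamma encoding to obtain linear-light values.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)

    # Linear RGB -> CIE XYZ (sRGB primaries, D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # Piecewise LAB nonlinearity: cube root above a small cutoff,
        # linear segment below it.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    # Normalise by the D65 white point before applying the nonlinearity.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def expand_over_time(image, timesteps):
    """Repeat a static image `timesteps` times to form a time sequence."""
    return [image for _ in range(timesteps)]


# Hypothetical 1x2 image: one white pixel and one bluish-green pixel.
rgb_image = [[(1.0, 1.0, 1.0), (0.1, 0.5, 0.6)]]
lab_image = [[srgb_to_lab(p) for p in row] for row in rgb_image]

T = 4  # user-defined number of timesteps (value chosen for illustration)
rgb_sequence = expand_over_time(rgb_image, T)
lab_sequence = expand_over_time(lab_image, T)
```

Both sequences have the same length `T` and the same spatial layout, so they can be fed to the two branches of a network step by step. Because every timestep holds the same frame, any temporal variation comes from the spiking neurons' internal state rather than from the input.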

Paper Structure

This paper contains 29 sections, 44 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Computational graph of LIF neuron over the timesteps.
  • Figure 2: Spike coding using LIF neuron with adaptive threshold: In the first stage, feature maps are generated when each channel of the input RGB images undergoes convolution with its respective kernel. The output is then fed to a spiking neuron, where the membrane potential ($V$) is updated depending on the presynaptic input. When $V$ exceeds the threshold membrane potential ($V_{th}$), a spike is fired, and $V$ resets to a predefined value. In the adaptive threshold mechanism, $V_{th}$ is updated during backpropagation.
  • Figure 3: Schematic representation of snnTrans-DHZ framework for underwater image dehazing. Spike coding is performed simultaneously on the time-dependent raw image sequences in the RGB and LAB color spaces. In the K estimator module, the low-level features from the RGB and LAB color spaces are extracted separately using the spike transformer, then concatenated and fed to the deconvolution layer. The separately extracted feature maps at different spatial resolutions are fused and fed to the corresponding deconvolution layers. In the background light estimator module, RGB-LAB spike-coded signals are concatenated and fed to the convolution layers, followed by ALIF neurons for feature extraction. The $\textbf{K}$ and $\textbf{B}$ estimates, together with $\textbf{X}^{RGB}_{img}$, are fed to the soft image reconstruction module for the reconstruction of haze-free images. The ALIF neurons with a learnable threshold membrane potential are used throughout the network.
  • Figure 4: Key differences between Vanilla Self-Attention (VSA) and Adaptive Spike-Based Self-Attention (ASBSA). (a) VSA follows the traditional attention mechanism, where input $X$ is mapped into query ($Q$), key ($K$), and value ($V$) representations, followed by scaled dot-product attention and softmax normalization; (b) ASBSA introduces ALIF neurons with a learnable threshold potential and a spike-based transformation module $G^{Trans}$ that replaces the traditional linear layers. This enhances energy efficiency and sparsity by leveraging spiking neural dynamics. The attention computation retains matrix multiplications but eliminates the scale and softmax operations.
  • Figure 5: Qualitative analysis of the proposed snnTrans-DHZ framework on six different images from UIEB test samples. The top row indicates the unprocessed, raw underwater input images. The network-predicted images using the UIE-SNN and snnTrans-DHZ frameworks are shown in the second and third rows respectively. The last row represents the visibility-enhanced reference images.
  • ...and 1 more figures
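
The neuron behaviour described in the Figure 2 caption (integrate the presynaptic input, fire when $V$ exceeds $V_{th}$, then reset) can be illustrated with a small forward simulation. This is a sketch under stated assumptions, not the paper's implementation: the function name `lif_forward`, the multiplicative leak `decay`, the hard reset to `v_reset`, and all numeric values are hypothetical. In the paper, $V_{th}$ is learned during backpropagation; here it is simply a parameter that can be changed by hand to show its effect on the spike train.

```python
def lif_forward(inputs, v_th=1.0, decay=0.5, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over a sequence of inputs.

    inputs  : presynaptic input at each timestep
    v_th    : threshold membrane potential (learnable in an ALIF neuron)
    decay   : assumed leak factor applied to the previous potential
    v_reset : assumed value the potential returns to after a spike
    Returns a list of 0/1 spikes, one per timestep.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: the old potential decays, the new input is added.
        v = decay * v + x
        if v > v_th:
            # Threshold crossed: emit a spike and hard-reset the potential.
            spikes.append(1)
            v = v_reset
        else:
            spikes.append(0)
    return spikes


# The same input repeated over four timesteps, mirroring a static image
# presented repeatedly to the network.
constant_input = [0.6] * 4

spikes_default = lif_forward(constant_input, v_th=1.0)  # [0, 0, 1, 0]
spikes_lowered = lif_forward(constant_input, v_th=0.8)  # [0, 1, 0, 1]
```

Lowering the threshold from 1.0 to 0.8 doubles the number of spikes for the same input, which is why making $V_{th}$ trainable gives the network direct control over its firing rate and therefore its sparsity. Training such a neuron requires a surrogate gradient because the threshold comparison itself is not differentiable; that part is omitted here.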