
Deep Multi-Threshold Spiking-UNet for Image Processing

Hebei Li, Yueyi Zhang, Zhiwei Xiong, Xiaoyan Sun

TL;DR

This work addresses the challenge of efficient pixel-wise image processing on neuromorphic hardware by introducing Spiking-UNet, a deep SNN converted from a pre-trained U-Net. The approach combines multi-threshold (MT) spiking neurons, connection-wise weight normalization, and a flow-based fine-tuning pipeline to convert U-Nets into Spiking-UNets and then optimize them, significantly reducing inference time while maintaining competitive segmentation and denoising performance. Key contributions include a principled MT neuron design with optimal thresholding, a normalization scheme tailored for skip connections, and an accumulated-spiking-flow training method that lowers the number of time steps without sacrificing accuracy. The results demonstrate performance comparable to U-Net along with substantial energy savings, highlighting the practical potential of Spiking-UNets for neuromorphic image processing and pointing toward further neuromorphic deployments and extensions to additional tasks.

Abstract

U-Net, known for its simple yet efficient architecture, is widely utilized for image processing tasks and is particularly suitable for deployment on neuromorphic chips. This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture. To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy. To address the issue of information loss, we introduce multi-threshold spiking neurons, which improve the efficiency of information transmission within the Spiking-UNet. For the training strategy, we adopt a conversion and fine-tuning pipeline that leverages pre-trained U-Net models. During the conversion process, we observe significant variability in data distribution across the different parts joined by skip connections. Therefore, we propose a connection-wise normalization method to prevent inaccurate firing rates. Furthermore, we adopt a flow-based training method to fine-tune the converted models, reducing time steps while preserving performance. Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves performance comparable to its non-spiking counterpart and surpasses existing SNN methods. Compared with the converted Spiking-UNet without fine-tuning, our Spiking-UNet reduces inference time by approximately 90%. This research broadens the application scope of SNNs in image processing and is expected to inspire further exploration in the field of neuromorphic engineering. The code for our Spiking-UNet implementation is available at https://github.com/SNNresearch/Spiking-UNet.
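
To make the idea of multi-threshold spiking neurons more concrete, the following is a minimal sketch, not the authors' implementation. It assumes an integrate-and-fire neuron that, at each time step, fires at the largest threshold its membrane voltage has reached, emits a spike whose amplitude equals that threshold, and resets by subtraction. The function name `simulate_neuron` and the threshold values `[1.0, 0.5, 0.25]` are hypothetical choices for illustration only.

```python
def simulate_neuron(inputs, thresholds):
    """Simulate one integrate-and-fire neuron with one or more thresholds.

    Assumed firing rule (illustrative): at each step the neuron fires at
    the largest threshold that the membrane voltage has reached, emits a
    spike of that amplitude, and subtracts it from the voltage.
    Returns the emitted amplitudes per step and the residual voltage.
    """
    levels = sorted(thresholds, reverse=True)  # largest threshold first
    voltage = 0.0
    outputs = []
    for current in inputs:
        voltage += current          # integrate the input
        emitted = 0.0
        for level in levels:
            if voltage >= level:    # fire at the largest reachable threshold
                emitted = level
                voltage -= level    # reset by subtraction
                break
        outputs.append(emitted)
    return outputs, voltage


# A weak constant input over only two time steps.
weak_input = [0.3, 0.3]

single_out, _ = simulate_neuron(weak_input, [1.0])
multi_out, _ = simulate_neuron(weak_input, [1.0, 0.5, 0.25])

print("single-threshold output:", single_out)
print("multi-threshold output: ", multi_out)
```

With a single threshold of 1.0 the neuron stays silent for both steps, so the weak input is lost entirely. With the additional lower thresholds it emits 0.25 at each step, which approximates the input of 0.3 much more closely within the same number of time steps.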

Paper Structure

This paper contains 31 sections, 2 theorems, 10 equations, 5 figures, 7 tables, 1 algorithm.

Key Result

Theorem 3.1

Consider a multi-threshold neuron model with a membrane voltage $V$ that follows a uniform distribution in the range $[0, 1]$ ($V \sim U[0, 1]$). Let $N$ thresholds be denoted as $V_{th,1}, V_{th,2}, \ldots, V_{th,N}$, subject to the constraints $V_{th,1} > V_{th,2} > \cdots > V_{th,N-1} > V_{th,N}$ and $

Figures (5)

  • Figure 1: (a) A single-threshold spiking neuron has only one threshold and represents information poorly. (b) A multi-threshold spiking neuron has multiple thresholds and transmits information more effectively. (c) A multi-threshold spiking neuron within a convolution in an SNN. (d) The Spiking-UNet architecture for the image segmentation task, which largely follows the classic U-Net architecture. Information flows through Spiking-UNet as discrete spikes instead of continuous values. Average pooling replaces max pooling to simplify conversion.
  • Figure 2: Illustration of different weight normalization methods. (a) Layer-wise normalization. (b) Connection-wise normalization. Layer-wise normalization uses a single ratio of the maximum input activation to the maximum output activation and applies it to all weights. Connection-wise normalization accounts for the concatenation operation: for each input part, it computes the ratio of that part's maximum activation to the maximum output activation and applies it to the corresponding weights.
  • Figure 3: Activation distributions of U-Net and the correlation between activation and spiking rate for layer-wise and connection-wise normalization. (a-d) show the 11th, 13th, 15th, and 17th layers after skip connections, respectively.
  • Figure 4: Qualitative segmentation results on images from the DRIVE, EM, and CamSeq01 datasets. (a) Input. (b) Ground Truth (GT). Segmentation results with (c) U-Net. (d) Spiking-FCN. (e, f) Multi-Level (ML) spiking neuron with direct training and fine-tuning, respectively. (g, h) Real Spike (RS) spiking neuron with direct training and fine-tuning, respectively. (i, j) Multi-Threshold (MT) spiking neuron with direct training and fine-tuning, respectively. 'DT' and 'FT' represent direct training and fine-tuning, respectively.
  • Figure 5: Qualitative denoising results on images from the BSD68 and CBSD68 datasets. (a) Input. (b) Ground Truth (GT). Denoising results with (c) U-Net. (d, e) Multi-Level (ML) spiking neuron with direct training and fine-tuning, respectively. (f, g) Real Spike (RS) spiking neuron with direct training and fine-tuning, respectively. (h, i) Multi-Threshold (MT) spiking neuron with direct training and fine-tuning, respectively.
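
The difference between the two normalization schemes in Figure 2 can be sketched in a few lines. This is an illustrative example under stated assumptions, not the paper's code: the layer is a plain linear map (standing in for a 1x1 convolution), each concatenated input part arrives already normalized by its own maximum activation, and the layer-wise variant uses the largest part maximum as its single input maximum. All function names and numbers are hypothetical.

```python
def linear(weights, inputs):
    """Apply a weight matrix (list of rows) to an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def layer_wise_normalize(weights, in_max, out_max):
    """Scale every weight by one shared ratio in_max / out_max."""
    ratio = in_max / out_max
    return [[w * ratio for w in row] for row in weights]


def connection_wise_normalize(weights, part_sizes, part_maxes, out_max):
    """Scale each input column by the ratio of its own part's maximum
    activation to the maximum output activation."""
    scales = []
    for size, part_max in zip(part_sizes, part_maxes):
        scales.extend([part_max / out_max] * size)
    return [[w * s for w, s in zip(row, scales)] for row in weights]


# Concatenated input: 2 channels from a skip connection (max 4.0)
# followed by 1 channel from the upsampling path (max 0.5).
part_sizes = [2, 1]
part_maxes = [4.0, 0.5]
raw_input = [4.0, 2.0, 0.5]
weights = [[0.5, 0.25, 2.0]]

raw_output = linear(weights, raw_input)[0]  # 3.5 in the original network
out_max = 3.5                               # assumed maximum output activation

# Each part is normalized by its own maximum before the concatenation.
normalized_input = [4.0 / 4.0, 2.0 / 4.0, 0.5 / 0.5]

cw = connection_wise_normalize(weights, part_sizes, part_maxes, out_max)
lw = layer_wise_normalize(weights, max(part_maxes), out_max)

print("target normalized output:", raw_output / out_max)
print("connection-wise result:  ", linear(cw, normalized_input)[0])
print("layer-wise result:       ", linear(lw, normalized_input)[0])
```

In this toy setting the connection-wise weights reproduce the target normalized output, whereas the single layer-wise ratio overweights the low-magnitude upsampling channel and produces a value well above 1, which a rate-coded neuron could not represent faithfully.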

Theorems & Definitions (3)

  • Theorem 3.1
  • Theorem 1
  • Proof 1