Topology Optimization of Random Memristors for Input-Aware Dynamic SNN

Bo Wang, Shaocong Wang, Ning Lin, Yi Li, Yifei Yu, Yue Zhang, Jichang Yang, Xiaoshan Wu, Yangu He, Songqi Wang, Rui Chen, Guoqi Li, Xiaojuan Qi, Zhongrui Wang, Dashan Shang

TL;DR

This work introduces pruning optimization for input-aware dynamic memristive spiking neural network (PRIME), which uses spiking neurons to emulate the brain's spiking mechanism and, inspired by structural plasticity, optimizes the topology of random memristive SNNs, effectively mitigating memristor programming stochasticity.

Abstract

Machine learning has seen unprecedented development, exemplified by recent large language models and world simulators, which are artificial neural networks running on digital computers. However, they still cannot match human brains in energy efficiency or in streamlined adaptability to inputs of different difficulties, due to differences in signal representation, optimization, run-time reconfigurability, and hardware architecture. To address these fundamental challenges, we introduce pruning optimization for input-aware dynamic memristive spiking neural network (PRIME). Signal representation-wise, PRIME employs leaky integrate-and-fire neurons to emulate the brain's inherent spiking mechanism. Drawing inspiration from the brain's structural plasticity, PRIME optimizes the topology of a random memristive spiking neural network without expensive memristor conductance fine-tuning. For run-time reconfigurability, inspired by the brain's dynamic adjustment of computational depth, PRIME employs an input-aware dynamic early stop policy to minimize latency during inference, thereby boosting energy efficiency without compromising performance. Architecture-wise, PRIME leverages memristive in-memory computing, mirroring the brain and mitigating the von Neumann bottleneck. We validated our system using a 40 nm 256 Kb memristor-based in-memory computing macro on neuromorphic image classification and image inpainting. Our results demonstrate that the classification accuracy and Inception Score are comparable to the software baseline, while achieving up to a 62.50-fold improvement in energy efficiency and up to 77.0% savings in computational load. The system also exhibits robustness against the stochastic synaptic noise of analogue memristors. Our software-hardware co-designed model paves the way to future brain-inspired neuromorphic computing with brain-like energy efficiency and adaptivity.
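
The input-aware dynamic early stop policy mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the use of a running mean of per-timestep outputs, and the softmax-based confidence are assumptions chosen for illustration. The idea shown is only that inference halts at the first timestep whose confidence reaches a threshold, so easy inputs use fewer timesteps than hard ones.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def infer_with_early_stop(step_outputs, confidence_threshold):
    """Run over per-timestep output vectors and stop early when confident.

    step_outputs: non-empty list of per-timestep output vectors (hypothetical
    stand-in for the SNN readout at each timestep).
    Returns (predicted_class, timesteps_used).
    """
    accumulated = [0.0] * len(step_outputs[0])
    for t, out in enumerate(step_outputs, start=1):
        # Assumption: confidence is the top softmax probability of the
        # running mean of the outputs seen so far.
        accumulated = [a + o for a, o in zip(accumulated, out)]
        probs = softmax([a / t for a in accumulated])
        confidence = max(probs)
        if confidence >= confidence_threshold:
            break  # confident enough: skip the remaining timesteps
    return probs.index(confidence), t

easy_input = [[4.0, 0.0, 0.0]] * 5   # clear winner from the first timestep
hard_input = [[1.0, 0.8, 0.0]] * 5   # two classes stay close

easy_result = infer_with_early_stop(easy_input, 0.9)
hard_result = infer_with_early_stop(hard_input, 0.9)
```

In this toy setting the easy input stops after one timestep while the hard input consumes all five, which is the latency saving the policy is meant to provide. A threshold of 1.0 is never reached by these inputs, so it amounts to running every timestep.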

Paper Structure (25 sections, 7 equations, 5 figures)

Figures (5)

  • Figure 1: Brain-inspired topology optimization for input-aware dynamic SNN on memristors. a, Comparison of the information representation in the human brain, the artificial neuron model of conventional static ANNs, and the spiking neuron model of PRIME. Both the human brain and PRIME encode information with spikes, whereas ANNs do not. b, Comparison of the optimization scheme in the human brain with structural plasticity, a weight-trained neural network, and a topology-optimized neural network. The human brain and PRIME optimize network topology instead of relying on fine-tuning of synaptic weights, as seen in conventional ANNs. c, Comparison of run-time reconfigurability in the human brain with dynamic computational depth, conventional static ANNs, and PRIME. The human brain and PRIME feature dynamic computational depth and dynamically adapt to new stimuli to reduce computational costs. In contrast, conventional ANNs have a fixed computational depth that stays constant across inputs of different difficulties. d, Comparison of the hardware architecture in the human brain, the digital hardware implementing conventional ANNs, and the memristive neuromorphic system of PRIME. The human brain and PRIME utilize in-memory computing, which collocates processing and memory in biological synapses and memristors, respectively, thereby enhancing energy efficiency. In contrast, digital computers based on the von Neumann architecture separate storage and computing.
  • Figure 2: Overview of PRIME. a, The brain-inspired topology optimization of a randomly initialized SNN (left). Initially, an overparameterized SNN with random memristor connections is generated using inherent programming stochasticity. Each random synaptic weight is then assigned a score $s$, reflecting its importance. Synapses with the top $k\%$ scores are retained, while others are pruned to form a subnet. The loss is calculated using this subnet and backpropagated through the entire network to optimize the scores. This process is iterated until convergence, yielding the optimal subnet. Further details are provided in Methods. The brain-inspired input-aware dynamic SNN in inference (right). In inference, time-wise confidence is calculated, either as a softmax score or a consistency score, depending on the task. If the confidence meets the threshold policy, the inference terminates. Further details are provided in Methods. b, Schematic of the memristor-based SNN before (left) and after (right) topology optimization, consisting of memristor crossbar arrays and leaky integrate-and-fire neurons. Before pruning, the differential memristor pairs in the randomly overparameterized supernet ($\textbf{G}^+$ and $\textbf{G}^-$) follow a mixture of Gaussian distributions. After pruning, the redundant memristor pairs are RESET, resulting in a conductance peak around zero. c, Optical photo of the 40 nm 256 Kb memristor in-memory computing macro (left). Cross-sectional HAADF–STEM image of the memristor array (middle and right). d, Joint distribution of the mean conductance and standard deviation of 128 randomly selected resistive memory cells in 10,000 reinstating programming cycles (left). Joint distribution of the 128 resistive differential pairs before (middle) and after pruning (right).
  • Figure 3: Experimental image classification for N-MNIST dataset with PRIME. a, Illustration of the convolutional SNN of PRIME during inference, showing pruned random memristor kernels and associated feature maps in the N-MNIST classification. The network outputs at each time step are used to compute confidence scores. b, The classification accuracy and dynamic latency (evaluated as the average timesteps of the test data) comparisons of hardware PRIME and software baseline at various early stop thresholds. c, t-SNE visualizations of feature maps from PRIME at different early stop thresholds, coloured according to ground truth labels. d, Confusion matrices and classification accuracy of PRIME at different early stop thresholds. e, The classification accuracy for SNNs by different optimization methods. Weight Tuning: Optimize the weights of software SNN through STBP. Software Pruning: Optimize the topology of software SNN, with randomly initialized weights. Memristor Pruning: Optimize the topology of memristor-based SNN, where the random weights are produced by memristor programming stochasticity. Random Software and Memristor Weights: The SNNs are initialized with random weights, separately implemented in software and on memristors. f, Comparison of the inference energy of a single image on a projected hybrid analogue-digital system and digital hardware at different early stop thresholds. The former shows a significant energy reduction due to in-memory computing. g, Impact of memristor programming noise on accuracy between memristor-based SNNs optimized by different methods at various noise levels.
  • Figure 4: Experimental image inpainting of MNIST dataset with PRIME. a, The spiking VAE of PRIME showing example feature maps during MNIST image inpainting. The latency of the spiking VAE is dynamically adapted by the image consistency confidence score (Methods). b, The reconstruction loss (Methods) and dynamic timestep (evaluated as the average timesteps of the test data) comparisons of hardware PRIME and software baseline at various early stop thresholds on MNIST (left). The raw, input, and reconstructed images at different thresholds (right). c, The reconstruction loss for SNNs optimized by different methods. Weight Tuning: Optimize the weights of software SNN through STBP. Software Pruning and Software Pruning ES 1.0: Optimize the topology of software SNN with randomly initialized weights. The dynamic early stop policy is either applied (former) or not applied (latter) in inference. Memristor Pruning and Memristor Pruning ES 1.0: Optimize the topology of memristor-based SNN, where the random weights are produced by memristor programming stochasticity. The dynamic early stop policy is either applied (former) or not applied (latter) in inference. d, Impact of memristor programming noise on SNNs optimized by different methods at various noise levels. e, The Inception Score (IS) comparisons of SNNs by different optimization methods at various early stop thresholds. f, Comparison of the inference energy of a projected hybrid analogue-digital system and digital hardware at different early stop thresholds.
  • Figure 5: Memristor noise and impact on PRIME. a, The mechanism of memristor programming and read noise. The random ion motion in filament formation and dissolution leads to programming noise, while charge trapping/detrapping and thermal fluctuation yield read noise. b, The memristor programming noise shown as a heatmap and histogram, which follows a quasi-normal distribution for initializing random weights in PRIME. c, The memristor read noise of 15 randomly selected memristors with 10,000 read cycles, showing clear temporal conductance fluctuation that degrades the model's performance. d, Noise robustness evaluation of PRIME at various early stop thresholds on N-MNIST classification. e, Noise robustness evaluation of PRIME at various early stop thresholds on MNIST image inpainting.
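
The subnet selection step described in the Figure 2 caption (assign each random synapse a score, keep the top $k\%$, prune the rest) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the conductance values, and the convention that a differential pair's weight is the difference between its two conductances are all hypothetical. The gradient-based update of the scores is not shown; only the masking of fixed random weights is.

```python
import random

def top_k_mask(scores, keep_fraction):
    """Return a 0/1 mask that keeps the highest-scoring fraction of synapses."""
    n_keep = max(1, round(keep_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(scores))]

def effective_weights(g_pos, g_neg, mask):
    """Assumed convention: each weight is G+ minus G-; pruned pairs give 0,
    standing in for pairs that are RESET to near-zero conductance."""
    return [m * (gp - gn) for gp, gn, m in zip(g_pos, g_neg, mask)]

rng = random.Random(0)
n_synapses = 8

# Hypothetical conductances drawn from a Gaussian to mimic the random,
# untrained weights produced by programming stochasticity.
g_pos = [rng.gauss(50.0, 10.0) for _ in range(n_synapses)]
g_neg = [rng.gauss(50.0, 10.0) for _ in range(n_synapses)]

# Importance scores would be learned; random values are used here.
scores = [rng.random() for _ in range(n_synapses)]

mask = top_k_mask(scores, keep_fraction=0.5)
weights = effective_weights(g_pos, g_neg, mask)
```

The conductances themselves are never adjusted in this sketch; only the mask changes which pairs contribute, which mirrors the caption's point that topology is optimized rather than the weights.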