SNAP-V: A RISC-V SoC with Configurable Neuromorphic Acceleration for Small-Scale Spiking Neural Networks

Kanishka Gunawardana, Sanka Peeris, Kavishka Rambukwella, Thamish Wanduragala, Saadia Jameel, Roshan Ragel, Isuru Nawinne

Abstract

Spiking Neural Networks (SNNs) have gained significant attention in edge computing due to their low power consumption and computational efficiency. However, existing implementations either use conventional System on Chip (SoC) architectures that suffer from memory-processor bottlenecks, or rely on large-scale neuromorphic hardware that is wasteful for small-scale SNN applications. This work presents SNAP-V, a RISC-V-based neuromorphic SoC with two accelerator variants, Cerebra-S (bus-based) and Cerebra-H (Network-on-Chip (NoC)-based), both optimized for small-scale SNN inference. The SoC integrates a RISC-V core for management tasks, and both accelerators feature parallel processing nodes and distributed memory. Experimental results show close agreement between software and hardware inference, with an average accuracy deviation of 2.62% across multiple network configurations, and an average synaptic energy of 1.05 pJ per synaptic operation (SOP) in 45 nm CMOS technology. These results show that the proposed solution enables accurate, energy-efficient SNN inference suitable for real-time edge applications.

Paper Structure

This paper contains 30 sections, 5 figures, and 5 tables.

Figures (5)

  • Figure 1: High-Level Overview of the SNAP-V SoC. The architecture integrates a MainCore for general-purpose tasks and a SpikeCore for accelerator orchestration via the RoCC interface. The neuromorphic subsystem (Cerebra-H) communicates with the SoC through a dedicated Blackbox Module containing encoding/decoding hardware and an accelerator controller.
  • Figure 2: Accelerator Design of Cerebra-S. The architecture consists of a tiled array of 1024 physical neurons connected to a shared neuron interconnect via a global tagged bus. Each individual neuron tile contains an accumulator unit for synaptic integration, a potential decay unit, and a potential adder unit for threshold evaluation and spike generation.
  • Figure 3: Accelerator Design of Cerebra-H. The clustered neuromorphic architecture employs a hierarchical Network-on-Chip (NoC) topology. Lower-layer routers (L1) connect groups of four neuron clusters (NC0-NC31), which are further aggregated by a central upper-layer router (L2) to facilitate parallel spike communication and reduce global routing overhead.
  • Figure 4: Neuron Microarchitecture. The internal datapath of a single configurable Leaky Integrate-and-Fire (LIF) neuron. It includes a finite-state machine control logic block, an accumulator unit for integrating incoming 32-bit synaptic weights, a potential decay unit utilizing arithmetic right-shifts, and a potential adder unit that evaluates membrane thresholds to generate spike outputs.
  • Figure 5: Functional Block Diagram of a Cluster. The internal organization of a single 32-neuron cluster illustrating the dual communication paths. The cluster controller manages configuration packets, while the incoming forwarder and outgoing encoder handle the routing and serialization of 11-bit spike packets to and from the local neuron bank.
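
The neuron datapath described in the Figure 4 caption (accumulate incoming weights, decay the potential with an arithmetic right-shift, then compare against a threshold) can be illustrated with a small behavioral sketch. This is not the authors' RTL. The class name, the parameter names, the decay rule `V - (V >> k)`, the reset-to-zero on spike, and the order of operations within a timestep are all assumptions made for illustration, and 32-bit overflow is not modeled.

```python
from dataclasses import dataclass


@dataclass
class LIFNeuron:
    """Behavioral sketch of a shift-decay LIF neuron (hypothetical model)."""

    threshold: int        # firing threshold (assumed parameter name)
    decay_shift: int      # leak strength: a larger shift leaks more slowly (assumed)
    potential: int = 0    # membrane potential carried between timesteps
    accumulator: int = 0  # sum of synaptic weights received this timestep

    def receive(self, weight: int) -> None:
        """Accumulator unit: integrate one incoming synaptic weight."""
        self.accumulator += weight

    def step(self) -> bool:
        """Advance one timestep; return True if the neuron spikes."""
        # Potential decay unit: subtract a right-shifted copy of the potential.
        # Python's >> on ints is an arithmetic shift, so negative potentials
        # decay toward zero as well.
        leaked = self.potential - (self.potential >> self.decay_shift)

        # Potential adder unit: add the accumulated input, then test the threshold.
        total = leaked + self.accumulator
        self.accumulator = 0

        if total >= self.threshold:
            self.potential = 0  # assumed reset-to-zero behavior
            return True
        self.potential = total
        return False
```

With `threshold=100` and `decay_shift=2`, an input of 60 in the first timestep leaves the potential at 60 without a spike. A second input of 60 is added to the decayed value 60 - 15 = 45, giving 105, which crosses the threshold and resets the potential.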