SpikeFI: A Fault Injection Framework for Spiking Neural Networks
Theofilos Spyrou, Said Hamdioui, Haralampos-G. Stratigopoulos
TL;DR
SpikeFI addresses the reliability of spiking neural networks deployed on neuromorphic hardware by providing a GPU-accelerated fault-injection (FI) framework that maps hardware faults to behavioral SRM models. Built on SLAYER/PyTorch, SpikeFI offers a comprehensive fault-model library; supports pre-, during-, and post-training injection and multi-round campaigns; and includes speed optimizations (late start, early stop, batched inference) as well as visualization tools. Its key contributions include open-source fault models for neurons and synapses (hard and parametric, permanent or transient), detailed FI campaign structuring, and case studies on N-MNIST and DVS128 Gesture that guide fault-tolerant design and testing. The framework enables reliability analysis, test generation, and fault-aware training to improve the robustness of neuromorphic systems in practical AI tasks.
Abstract
Neuromorphic computing and spiking neural networks (SNNs) are gaining traction across various artificial intelligence (AI) tasks thanks to their potential for energy efficiency and fast computation. This comparative advantage comes from mimicking the structure, function, and efficiency of the biological brain, which is arguably the most brilliant and green computing machine. As SNNs are eventually deployed on a hardware processor, the reliability of the application in light of hardware-level faults becomes a concern, especially for safety- and mission-critical applications. In this work, we propose SpikeFI, a fault injection framework for SNNs that can be used to automate reliability analysis and test generation. SpikeFI is built upon the SLAYER PyTorch framework, with fault injection experiments accelerated on a single GPU or multiple GPUs. It has a comprehensive integrated neuron and synapse fault model library, in accordance with the literature in the domain, which the user can extend if needed. It supports: single and multiple faults; permanent and transient faults; specified, random layer-wise, and random network-wise fault locations; and pre-, during-, and post-training fault injection. It also offers several speed optimizations and built-in functions for results visualization. SpikeFI is open-source and available for download via GitHub at https://github.com/SpikeFI.
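To make the idea of a fault injection campaign concrete, the following is a minimal, self-contained sketch in plain Python. It does not use SpikeFI, SLAYER, or PyTorch, and it is not the SpikeFI API: every name (`run_lif_layer`, `classify`, `campaign`, `WEIGHTS`, `SPIKES`) is hypothetical, and the neuron is a simplified leaky integrate-and-fire model assumed here for brevity rather than the SRM model mentioned above. It injects one permanent neuron fault at a time (dead or saturated) at each specified location and flags the fault as critical if the predicted class differs from the fault-free ("golden") prediction.

```python
def run_lif_layer(weights, in_spikes, threshold=1.0, leak=0.9,
                  dead=(), saturated=()):
    """Simulate one layer of leaky integrate-and-fire neurons (illustrative).

    weights[j][i] is the synapse from input i to neuron j;
    in_spikes[t][i] is 0 or 1. Returns the output spike count per neuron.
    Neurons listed in `dead` never fire; those in `saturated` fire every step.
    """
    n_out = len(weights)
    v = [0.0] * n_out          # membrane potentials
    counts = [0] * n_out       # output spike counts
    for x in in_spikes:
        for j in range(n_out):
            if j in dead:
                continue                       # dead neuron: no output spikes
            if j in saturated:
                counts[j] += 1                 # saturated neuron: always spikes
                continue
            v[j] = leak * v[j] + sum(w * s for w, s in zip(weights[j], x))
            if v[j] >= threshold:
                counts[j] += 1
                v[j] = 0.0                     # reset after firing
    return counts


def classify(counts):
    """Predicted class = index of the neuron with the most spikes."""
    return max(range(len(counts)), key=counts.__getitem__)


def campaign(weights, in_spikes):
    """Inject each single neuron fault in turn and compare to the golden run."""
    golden_counts = run_lif_layer(weights, in_spikes)
    golden = classify(golden_counts)
    report = {}
    for j in range(len(weights)):
        for kind in ("dead", "saturated"):
            faulty = run_lif_layer(weights, in_spikes, **{kind: {j}})
            report[(kind, j)] = classify(faulty) != golden  # True = critical
    return golden_counts, golden, report


# Toy two-neuron layer driven by a constant input spike train (assumed data).
WEIGHTS = [[0.6, 0.0], [0.0, 0.3]]
SPIKES = [[1, 1]] * 10

golden_counts, golden, report = campaign(WEIGHTS, SPIKES)
for (kind, j), critical in sorted(report.items()):
    print(f"{kind:9s} neuron {j}: {'critical' if critical else 'benign'}")
```

In this toy setup, only faults that change which neuron spikes most are flagged as critical; a real campaign would evaluate accuracy over a whole dataset and would also cover synapse, parametric, and transient faults.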
