Single-Event Upset Analysis of a Systolic Array based Deep Neural Network Accelerator
Naïn Jonckers, Toon Vinck, Gert Dekkers, Peter Karsmakers, Jeffrey Prinzie
TL;DR
This work analyzes the reliability of a systolic-array-based DNN accelerator under Single-Event Upsets (SEUs) by performing cycle-accurate RTL fault injection across multiple systolic array (SA) sizes. The authors inject single SEUs into randomly selected flip-flops and study how each fault propagates through the SA core and the post-processing pipeline, using constrained random inputs to mimic DNN workloads. They find that the 32-bit register groups produce the highest fault magnitude and contain the most flip-flops, making them prime candidates for hardening, while the 8-bit groups and ReLU-induced masking offer notable resilience. The findings provide guidance for improving the reliability of AI accelerators that does not depend on a specific DNN model architecture, with recommendations aligned to practical fault-hardening strategies and future validation with real workloads.
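To give a feel for why register width and ReLU matter here, the following is a minimal software sketch of a single bit flip in a signed register. It is not the authors' RTL fault-injection environment. The function names (`flip_bit`, `relu`, `campaign`), the uniformly random register contents, and the trial count are all assumptions made for illustration; the paper itself uses constrained random inputs in cycle-accurate simulation.

```python
import random


def flip_bit(value: int, bit: int, width: int) -> int:
    """Flip one bit of a two's-complement signed integer of the given width."""
    mask = (1 << width) - 1
    raw = (value & mask) ^ (1 << bit)   # flip the chosen bit in the raw encoding
    if raw >= 1 << (width - 1):         # reinterpret the raw bits as signed
        raw -= 1 << width
    return raw


def relu(x: int) -> int:
    """Integer ReLU: negative values are clamped to zero."""
    return x if x > 0 else 0


def campaign(width: int, trials: int, seed: int = 0):
    """Inject one bit flip per trial into a random signed register value.

    Returns (mean absolute error, fraction of faults hidden by ReLU).
    Uniform random register contents are an illustrative assumption.
    """
    rng = random.Random(seed)
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    total_magnitude = 0
    masked = 0
    for _ in range(trials):
        golden = rng.randint(lo, hi)    # fault-free register value
        bit = rng.randrange(width)      # flip-flop hit by the upset
        faulty = flip_bit(golden, bit, width)
        total_magnitude += abs(faulty - golden)
        if relu(faulty) == relu(golden):
            masked += 1                 # fault is invisible after ReLU
    return total_magnitude / trials, masked / trials


if __name__ == "__main__":
    for width in (8, 32):
        magnitude, masked = campaign(width, trials=10_000)
        print(f"{width}-bit: mean |error| = {magnitude:.3g}, "
              f"ReLU-masked fraction = {masked:.2f}")
```

In this toy model, flipping bit k changes the stored value by exactly 2^k, so a wider register exposes higher-order bits and therefore larger possible errors. A fault is masked whenever both the fault-free and the faulty value are non-positive, since ReLU maps both to zero. Neither effect is a substitute for the per-group propagation probabilities measured in the RTL campaign.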
Abstract
Deep Neural Network (DNN) accelerators are extensively used to improve the computational efficiency of DNNs, but are prone to faults through Single-Event Upsets (SEUs). In this work, we present an in-depth analysis of the impact of SEUs on a Systolic Array (SA) based DNN accelerator. A fault injection campaign is performed through a Register-Transfer Level (RTL) based simulation environment to improve the observability of each hardware block, including the SA itself as well as the post-processing pipeline. From this analysis, we present the sensitivity, independent of a DNN model architecture, for various flip-flop groups both in terms of fault propagation probability and fault magnitude. This allows us to draw detailed conclusions and determine optimal mitigation strategies.
