
Exploration of Activation Fault Reliability in Quantized Systolic Array-Based DNN Accelerators

Mahdi Taheri, Natalia Cherezova, Mohammad Saeed Ansari, Maksim Jenihhin, Ali Mahani, Masoud Daneshtalab, Jaan Raik

TL;DR

This work targets the reliable deployment of quantized DNN accelerators on FPGA-based systolic arrays by investigating how activation faults interact with different quantization levels. It introduces DeepXcel, an automated framework that combines quantization-aware training, post-training quantization, range-restricted inference, fault injection, and hardware generation to assess accuracy, activation fault reliability, and hardware metrics. A lightweight range-check protection technique is integrated to mitigate activation faults at modest hardware overhead. Evaluation on LeNet-5 and AlexNet (MNIST and CIFAR-10) shows that quantization can increase the vulnerability of activations, while the proposed protection improves reliability by up to ~34% for LeNet-5 and ~52% for AlexNet and keeps the hardware overhead under 10% of LUT resources, giving concrete design-space guidance for safe, efficient quantized DNN accelerators.
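To make the idea of an activation fault and a range check concrete, here is a minimal sketch in plain Python. It is not the paper's hardware design: the helper names (quantize, flip_bit, range_check), the symmetric fixed-point format, the profiled bounds, and the policy of replacing an out-of-range activation with zero are all assumptions made for illustration.

```python
def quantize(x, bits, scale):
    """Map a real value to a signed integer code of the given bit-width (saturating)."""
    qmin, qmax = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(qmin, min(qmax, round(x / scale)))


def flip_bit(code, bit, bits):
    """Flip one bit of a two's-complement code and return the signed result."""
    mask = (1 << bits) - 1
    raw = (code & mask) ^ (1 << bit)
    return raw - (1 << bits) if raw >= (1 << (bits - 1)) else raw


def range_check(code, lo, hi):
    """Assumed protection policy: an activation outside [lo, hi] is replaced by 0."""
    return code if lo <= code <= hi else 0


if __name__ == "__main__":
    BITS, SCALE = 8, 0.0625        # assumed 8-bit format with a power-of-two scale
    LO, HI = 0, 40                 # assumed bounds, e.g. profiled on fault-free runs
    golden = quantize(1.25, BITS, SCALE)
    for bit in range(BITS):
        faulty = flip_bit(golden, bit, BITS)
        protected = range_check(faulty, LO, HI)
        print(f"bit {bit}: golden={golden} faulty={faulty} protected={protected}")
```

In this sketch, flips in high-order bits push the activation far outside the profiled range and are caught by the check, while flips in low-order bits stay in range and pass through with a small error. This is why a simple bounds comparison can be inexpensive yet still remove the most damaging faults.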

Abstract

Stringent reliability requirements for Deep Neural Network (DNN) accelerators go hand in hand with the need to reduce the computational burden on hardware platforms, i.e., to reduce energy consumption and execution time and to increase the efficiency of DNN accelerators. Moreover, the growing demand for specialized DNN accelerators with tailored requirements, particularly for safety-critical applications, necessitates comprehensive design space exploration to enable the development of efficient and robust accelerators that meet those requirements. The trade-off between hardware performance, i.e., area and delay, and the reliability of the DNN accelerator implementation therefore becomes critical and requires tools for analysis. This paper presents a comprehensive methodology for a holistic assessment of the trilateral impact of quantization on model accuracy, activation fault reliability, and hardware efficiency. It introduces a fully automated framework capable of applying various quantization-aware techniques, fault injection, and hardware implementation, thus enabling the measurement of hardware parameters. The paper also proposes a novel lightweight protection technique, integrated within the framework, to ensure dependable deployment of the final systolic-array-based FPGA implementation. Experiments on established benchmarks demonstrate the analysis flow and the profound implications of quantization on reliability, hardware performance, and network accuracy, particularly with respect to transient faults in the network's activations.
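The fault-injection part of such an analysis flow can be sketched as a small campaign that compares fault-free ("golden") and faulty predictions at several bit-widths. The sketch below is a toy under stated assumptions, not the framework's implementation: the layer is a random integer dot-product classifier, the fault model is a single bit flip in one activation per trial, and the names classify and mismatch_rate are hypothetical. Its numbers say nothing about the paper's results.

```python
import random


def flip_bit(code, bit, bits):
    """Flip one bit of a two's-complement code and return the signed result."""
    mask = (1 << bits) - 1
    raw = (code & mask) ^ (1 << bit)
    return raw - (1 << bits) if raw >= (1 << (bits - 1)) else raw


def classify(acts, weights):
    """Toy integer output layer: index of the largest dot product."""
    scores = [sum(a * w for a, w in zip(acts, row)) for row in weights]
    return scores.index(max(scores))


def mismatch_rate(bits, trials=2000, seed=0):
    """Fraction of trials in which one activation bit flip changes the predicted class."""
    rng = random.Random(seed)
    qmax = (1 << (bits - 1)) - 1
    mismatches = 0
    for _ in range(trials):
        # Non-negative activation codes (as after a ReLU) and random signed weights.
        acts = [rng.randint(0, qmax) for _ in range(8)]
        weights = [[rng.randint(-qmax, qmax) for _ in range(8)] for _ in range(4)]
        golden = classify(acts, weights)
        faulty = list(acts)
        i = rng.randrange(len(faulty))
        faulty[i] = flip_bit(faulty[i], rng.randrange(bits), bits)
        mismatches += classify(faulty, weights) != golden
    return mismatches / trials


if __name__ == "__main__":
    for bits in (8, 6, 4):
        print(f"{bits}-bit activations: mismatch rate = {mismatch_rate(bits):.3f}")
```

A real campaign would replace the random layer with the quantized network under test and inject faults per layer, so that the mismatch rate can be reported at layer granularity alongside accuracy and hardware cost.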

Paper Structure (12 sections, 3 equations, 6 figures, 4 tables)


Figures (6)

  • Figure 1: Hardware-induced reliability threats in an example DNN accelerator and their possible impact on the output
  • Figure 2: Proposed methodology flow
  • Figure 3: Proposed lightweight mitigation technique
  • Figure 4: LeNet-5 layer-level reports of reliability drop (based on fault injection for different quantized networks)
  • Figure 5: AlexNet layer-level reports of reliability drop (%) based on different quantization levels (unprotected design)
  • ...and 1 more figure