Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training

Akul Malhotra; Sumeet Kumar Gupta

TL;DR

The paper analyzes how memory faults affect activation-sparse quantized DNNs (AS QDNNs). It shows that increased activation sparsity sharpens the loss landscape and reduces fault tolerance, with AS QDNNs losing up to 11.13% more accuracy under faults than standard QDNNs. It then demonstrates that sharpness-aware quantization (SAQ) training, which flattens the weight loss landscape by optimizing a min-max objective with adversarial perturbations, improves fault tolerance for both AS and standard QDNNs, with accuracy gains of up to 19.50% and 15.82%, respectively, over conventionally trained baselines. Notably, SAQ-trained AS QDNNs can surpass conventionally trained standard QDNNs in faulty settings, enabling low-latency, fault-tolerant activation-sparse quantized models. The work highlights the practical potential of combining activation sparsity with SAQ to realize edge-efficient DNNs without sacrificing reliability under hardware faults.
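To make the min-max idea concrete, here is a minimal sketch of a generic sharpness-aware update on a toy quadratic loss. It is not the authors' SAQ implementation: the toy loss, the matrix `A`, the step size `lr`, the perturbation radius `rho`, and the function names are all illustrative assumptions, and quantization is left out entirely.

```python
import numpy as np

def loss(w, A):
    # Toy quadratic loss 0.5 * w^T A w, standing in for a network's training loss.
    return 0.5 * w @ A @ w

def grad(w, A):
    # Gradient of the toy loss (A is assumed symmetric).
    return A @ w

def sharpness_aware_step(w, A, lr=0.05, rho=0.05):
    g = grad(w, A)
    # Inner max, first-order approximation: move a distance rho along the gradient.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Outer min: descend using the gradient evaluated at the perturbed weights.
    return w - lr * grad(w + eps, A)

A = np.diag([10.0, 0.1])      # one sharp direction, one flat direction (assumed values)
w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(50):
    w = sharpness_aware_step(w, A)

initial_loss = loss(w0, A)
final_loss = loss(w, A)
print(initial_loss, final_loss)
```

The inner perturbation plays the role of the adversarial weight change; descending on the perturbed gradient penalizes solutions whose neighborhood has high loss, which is the flattening effect the summary refers to.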

Abstract

Improving the hardware efficiency of deep neural network (DNN) accelerators with techniques such as quantization and sparsity enhancement has shown immense promise. However, their inference accuracy in non-ideal real-world settings (such as in the presence of hardware faults) is yet to be systematically analyzed. In this work, we investigate the impact of memory faults on activation-sparse quantized DNNs (AS QDNNs). We show that a high level of activation sparsity comes at the cost of greater vulnerability to faults, with AS QDNNs exhibiting up to 11.13% lower accuracy than standard QDNNs. We establish that the degraded accuracy correlates with sharper minima in the loss landscape for AS QDNNs, which makes them more sensitive to perturbations in the weight values due to faults. Based on this observation, we employ sharpness-aware quantization (SAQ) training to mitigate the impact of memory faults. The AS and standard QDNNs trained with SAQ have up to 19.50% and 15.82% higher inference accuracy, respectively, compared to their conventionally trained equivalents. Moreover, we show that SAQ-trained AS QDNNs achieve higher accuracy in faulty settings than standard QDNNs trained conventionally. Thus, sharpness-aware training can be instrumental in achieving sparsity-related latency benefits without compromising on fault tolerance.
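As an illustration of what a memory fault does to stored quantized weights, the sketch below flips each stored bit independently with a fixed probability. The bit-flip fault model, the 4-bit word width, the fault rate, and the function name `inject_bit_flips` are assumptions made for this example; the abstract does not specify the fault models or bit widths used in the paper.

```python
import numpy as np

def inject_bit_flips(words, fault_rate, n_bits=4, seed=0):
    """Flip each of the n_bits stored bits of every word with probability fault_rate."""
    rng = np.random.default_rng(seed)
    words = np.asarray(words, dtype=np.uint8)
    # One Bernoulli draw per stored bit: shape (..., n_bits).
    flips = rng.random(words.shape + (n_bits,)) < fault_rate
    # Combine the per-bit decisions into one integer XOR mask per word.
    bit_values = (1 << np.arange(n_bits)).astype(np.uint8)
    mask = (flips * bit_values).sum(axis=-1).astype(np.uint8)
    return words ^ mask

# Hypothetical 4-bit weight memory (values 0..15).
clean = np.random.default_rng(1).integers(0, 16, size=(64, 64), dtype=np.uint8)
faulty = inject_bit_flips(clean, fault_rate=0.01)
print("fraction of corrupted words:", np.mean(clean != faulty))
```

Evaluating a network with `faulty` in place of `clean`, and repeating over several seeds, gives an accuracy-under-faults estimate of the kind the abstract compares between AS and standard QDNNs.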

Paper Structure (15 sections, 3 equations, 5 figures)

This paper contains 15 sections, 3 equations, 5 figures.

Introduction
Background and Related Work
Activation sparsity in DNNs
Memory faults in DNN accelerators
Sharpness-Aware Quantization (SAQ)
Impact of Faults on Activation-Sparse QDNNs
Experimental Framework
Results
Latency benefits of Enhanced Activation Sparsity
Impact of faults on Inference Accuracy
Weight Loss Landscape visualization
Mitigation Strategy: SAQ
SAQ-based fault mitigation strategy
Results
Conclusion

Figures (5)

Figure 1: (a) shows an activation-sparse QDNN (AS QDNN) with memory faults. We show that AS QDNNs suffer larger accuracy degradation due to faults than their standard counterparts. (b) describes the trade-off between the latency benefits of enhanced activation sparsity and the reduced fault tolerance. To overcome this, we utilize sharpness-aware quantization (SAQ) training, which enhances the AS QDNN's fault tolerance by flattening its weight loss landscape.
Figure 2: The activation sparsities of (a) LeNet-5 and (b) ResNet-18 standard QDNNs and activation-sparse (AS) QDNNs in both fault-free and faulty settings. The activation sparsity of the LeNet-5 and ResNet-18 AS QDNNs is 95.34% and 41.51% higher, respectively, than that of their standard counterparts, and is sustained in faulty environments.
Figure 3: Latencies of various QDNNs deployed on DNN accelerators
Figure 4: (a) The weight loss landscape of the standard and AS LeNet-5 QDNNs, visualized using the technique in [visualize]. x and y are normalized random directions. (b) and (c) show the impact of a fault (F) on the loss value in a landscape with (b) sharp and (c) flat minima. The fault causes a larger change in the loss value in the former.
Figure 5: Comparison of the impact of different fault scenarios on the classification accuracy of SAQ-trained and conventionally trained standard and activation-sparse (AS) QDNNs. AS QDNNs have lower fault tolerance than their standard counterparts, and SAQ-trained QDNNs display higher fault tolerance than their conventionally trained equivalents.
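
The intuition behind Figure 4(b) and (c) can be checked with a one-dimensional toy example: the same weight perturbation raises the loss far more at a sharp minimum than at a flat one. The quadratic loss, the curvature values, and the fault size below are illustrative assumptions, not values from the paper.

```python
def loss_change(curvature, fault_size):
    # 1-D quadratic loss L(w) = 0.5 * curvature * w**2 with its minimum at w = 0.
    # A fault that shifts the weight by fault_size raises the loss by this amount.
    return 0.5 * curvature * fault_size ** 2

fault = 0.5                                           # same perturbation in both cases
sharp = loss_change(curvature=8.0, fault_size=fault)  # sharp minimum
flat = loss_change(curvature=0.5, fault_size=fault)   # flat minimum
print(sharp, flat)  # 1.0 0.0625
```

In this toy setting the loss increase scales linearly with curvature, so a landscape that is 16 times sharper suffers a 16 times larger loss change from an identical fault.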
