Harden Deep Neural Networks Against Fault Injections Through Weight Scaling

Ninnart Fuengfusin; Hakaru Tamukoh

TL;DR

The paper addresses the vulnerability of deep neural networks to hardware faults that flip bits in stored weights and degrade accuracy. It introduces a low-overhead defense based on layer-wise weight scaling: the weights $W_i$ of layer $i$ are multiplied by a constant $c_i$ before storage and divided by the same constant on read, which shrinks the error that a bit-flip introduces. An optional variant moves the division to the output logits, when feasible, to reduce the number of divisions. The authors analyze the effects of scaling and rescaling theoretically, derive practical guidelines for selecting $c_i$ (setting $c_i = t/\max(|W_i|)$, with $t$ chosen per data type), and validate the approach on ImageNet models in FP32, FP16, and Q2.5 (8-bit fixed point), showing substantial robustness gains under fault injection.
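The store-and-read cycle described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the helper names (`quantize_q25`, `flip_bit`, `store_and_read`) are hypothetical, the Q2.5 layout is assumed to be 1 sign bit, 2 integer bits, and 5 fractional bits in two's complement, and `t = 2.0` is an arbitrary illustrative value rather than one recommended by the paper.

```python
import numpy as np

FRAC_BITS = 5  # assumed Q2.5 layout: 1 sign bit, 2 integer bits, 5 fractional bits


def quantize_q25(w):
    """Round to the nearest Q2.5 value and return the raw int8 codes."""
    codes = np.round(w * (1 << FRAC_BITS))
    return np.clip(codes, -128, 127).astype(np.int8)


def dequantize_q25(codes):
    """Convert raw int8 codes back to real values."""
    return codes.astype(np.float32) / (1 << FRAC_BITS)


def flip_bit(codes, index, bit):
    """Return a copy of `codes` with a single bit flipped in one element."""
    raw = codes.copy().view(np.uint8)
    raw[index] ^= np.uint8(1 << bit)
    return raw.view(np.int8)


def scale_constant(w, t):
    """Layer-wise constant c = t / max(|W|), as stated in the summary."""
    return t / np.max(np.abs(w))


def store_and_read(w, t=None, fault=None):
    """Scale, quantize, optionally inject one bit-flip, then rescale on read.

    `t=None` disables scaling (c = 1); `fault` is an (index, bit) pair.
    """
    c = 1.0 if t is None else scale_constant(w, t)
    codes = quantize_q25(w * c)          # multiply before storage
    if fault is not None:
        codes = flip_bit(codes, *fault)  # simulated memory fault
    return dequantize_q25(codes) / c     # divide on read


w = np.array([0.5, -0.25, 0.125, 0.0625], dtype=np.float32)
fault = (2, 6)  # flip bit 6 of the third weight

err_plain = abs(store_and_read(w, fault=fault)[2] - w[2])
err_scaled = abs(store_and_read(w, t=2.0, fault=fault)[2] - w[2])
print(err_plain, err_scaled)
```

In this toy case $c = 2.0 / 0.5 = 4$, and the same flipped bit produces an absolute error of 2.0 without scaling and 0.5 with scaling, a reduction by the factor $c$. Choosing $t$ too large would push scaled weights past the representable range, which is why $t$ has to be tied to the data type.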

Abstract

Deep neural networks (DNNs) have enabled smart applications on hardware devices. However, these hardware devices are vulnerable to unintended faults caused by aging, temperature variance, and write errors. These faults can cause bit-flips in DNN weights and significantly degrade the performance of DNNs. Thus, protection against these faults is crucial for the deployment of DNNs in critical applications. Previous works have proposed methods based on error correction codes; however, these methods often incur high overheads in both memory and computation. In this paper, we propose a simple yet effective method to harden DNN weights by multiplying the weights by constants before storing them to a fault-prone medium. When used, these weights are divided by the same constants to restore the original scale. Our method is based on the observation that errors from bit-flips have properties similar to additive noise; therefore, dividing by constants reduces the absolute error from bit-flips. To demonstrate our method, we conduct experiments across four ImageNet 2012 pre-trained models and three data types: 32-bit floating point, 16-bit floating point, and 8-bit fixed point. With only this multiplication of weights by constants, the Top-1 accuracy of 8-bit fixed-point ResNet50 improves by 54.418 at a bit-error rate of 0.0001.
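The additive-noise observation can be written out in one line. The symbols below are introduced here for illustration and are not taken from the paper: $w$ is an original weight, $c > 1$ its layer constant, and $e$ the perturbation that a bit-flip adds to the stored value $cw$. Treating $e$ as independent of $c$ is a simplifying assumption; it holds exactly for fixed-point storage, but only approximately for floating point, where the size of a flip depends on the stored exponent.

$$
\hat{w} = \frac{cw + e}{c} = w + \frac{e}{c},
\qquad
|\hat{w} - w| = \frac{|e|}{c} < |e| \quad \text{for } c > 1 .
$$

Without scaling ($c = 1$) the same flip leaves an error of $|e|$, so under this assumption the absolute error shrinks by the factor $c$.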
