DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks

Patrik Velčický, Jakub Breier, Mladen Kovačević, Xiaolu Hou

TL;DR

This work tackles the vulnerability of quantized neural networks to fault-injection bit-flip attacks by introducing DeepNcode, an encoding-based defense that assigns weight values to codewords from carefully chosen binary codes. By exploiting the properties of Hamming and extended-Hamming codes, including shortenings, DeepNcode creates large distances between codewords corresponding to values that differ by a single bit, significantly increasing the number of flips needed for an attacker to alter weights. The authors demonstrate substantial protection gains across 4-bit and 8-bit quantized networks (with no retraining and preserved accuracy) while analyzing memory and time overheads and proposing a detection-augmented deployment mode. The approach provides provable security margins tied to code parameters and offers a practical, hardware-agnostic defense suitable for embedded neural-network deployments facing Rowhammer-like fault attacks.
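The distance argument can be illustrated with a minimal sketch. It assumes a systematic $[7,4]$ Hamming code and a hypothetical assignment of two weight values to complementary codewords; the names `G_ROWS`, `encode`, and `HYPOTHETICAL_MAP` are illustrative and are not the paper's actual mapping. The values $5$ and $-3$ are taken from the paper's high-level overview example.

```python
def twos_complement(value: int, bits: int) -> int:
    """Return the `bits`-wide two's complement bit pattern of `value`."""
    return value & ((1 << bits) - 1)


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which `a` and `b` differ."""
    return bin(a ^ b).count("1")


# Rows of a generator matrix [I_4 | P] for a [7,4] Hamming code.
# This is one common systematic choice, assumed here for illustration.
G_ROWS = [0b1000110, 0b0100101, 0b0010011, 0b0001111]


def encode(message: int) -> int:
    """Encode a 4-bit message as a 7-bit codeword (XOR of selected rows)."""
    word = 0
    for i, row in enumerate(G_ROWS):
        if (message >> (3 - i)) & 1:
            word ^= row
    return word


CODEWORDS = [encode(m) for m in range(16)]

# Minimum distance over all distinct codeword pairs.
MIN_DISTANCE = min(
    hamming_distance(a, b)
    for i, a in enumerate(CODEWORDS)
    for b in CODEWORDS[i + 1:]
)

# The all-ones word belongs to this code, so every codeword's bitwise
# complement is also a codeword. The hypothetical assignment below places
# two values that are one bit apart in two's complement on such a pair.
ALL_ONES = 0b1111111
HYPOTHETICAL_MAP = {5: encode(0b0101), -3: encode(0b0101) ^ ALL_ONES}

plain = hamming_distance(twos_complement(5, 4), twos_complement(-3, 4))
protected = hamming_distance(HYPOTHETICAL_MAP[5], HYPOTHETICAL_MAP[-3])
print(f"two's complement: {plain} flip(s); encoded: {protected} flip(s)")
```

In this sketch the attacker's cost for the change from $5$ to $-3$ rises from one flip to seven, because the two values sit on complementary codewords. How DeepNcode actually assigns all weight values to codewords is described in the paper itself, not here.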

Abstract

Fault injection attacks are a potent threat against embedded implementations of neural network models. Several attack vectors have been proposed, such as misclassification, model extraction, and trojan/backdoor planting. Most of these attacks work by flipping bits in the memory where quantized model parameters are stored. In this paper, we introduce an encoding-based protection method against bit-flip attacks on neural networks, titled DeepNcode. We experimentally evaluate our proposal with several publicly available models and datasets, by using state-of-the-art bit-flip attacks: BFA, T-BFA, and TA-LBF. Our results show an increase in protection margin of up to $7.6\times$ for $4$-bit and $12.4\times$ for $8$-bit quantized networks. Memory overheads start at $50\%$ of the original network size, while the time overheads are negligible. Moreover, DeepNcode does not require retraining and does not change the original accuracy of the model.
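As a back-of-the-envelope reading of the memory figure, assume (this is not stated in the abstract) that the overhead is measured as the extra bits stored per weight, relative to the original $k$-bit width, when each weight is kept as an $n$-bit codeword:

$$\text{overhead} = \frac{n - k}{k}, \qquad \frac{n - k}{k} = 0.5 \iff n = 1.5\,k.$$

Under that assumption, a $50\%$ overhead would correspond to $6$-bit codewords for $4$-bit weights and $12$-bit codewords for $8$-bit weights, and a $7$-bit codeword for a $4$-bit weight would correspond to $3/4 = 75\%$.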

Paper Structure

This paper contains 22 sections, 14 equations, 6 figures, 13 tables.

Figures (6)

  • Figure 1: Summary of the improvement in the number of bit flips required for a successful attack with DeepNcode over the unprotected implementation for 4-bit and 8-bit quantized networks. The data is the average of the minimum number of bit flips over all the networks and datasets analyzed in this work. Detailed results can be found in tables \ref{tab:4-bit-result} and \ref{tab:8-bit-results}.
  • Figure 2: High-level overview of DeepNcode. The attacker's goal is to change the original weight value (e.g., $5$) to another value (e.g., $-3$). To do that, Rowhammer can be used to flip bits in memory. In this particular example, the change takes a single bit flip in the traditional two's complement representation, whereas with the DeepNcode protection all $7$ bits need to be flipped, drastically increasing the attacker's effort.
  • Figure 3: Illustration of a bit-flip attack on neural networks by utilizing the Rowhammer technique. The attacker flips a bit in the DDR memory by rapidly writing into the aggressor row. This changes the weight value stored in the victim row. During the inference phase, this alters the output value of the target neuron, and if enough neurons are hammered, the output class is changed.
  • Figure 4: A total of $7,500$ attacks, based on three recent bit-flip attacks on quantized neural networks, were carried out. These attacks resulted in $147,106$ weight value changes for $4$-bit quantized neural networks and $158,310$ weight value changes for $8$-bit quantized neural networks. The charts show the percentage of weight value changes by the number of bits attacked per weight value. Note that the shares for 4 attacked bits are too small to show on the chart ($\approx 0.01\%$ for 4-bit quantization and $\approx 0.003\%$ for 8-bit quantization), so they were merged with 3 bits.
  • Figure 5: A total of $3,800$ attacks, based on three recent bit-flip attacks on $4$-bit quantized neural networks, were carried out. These attacks resulted in $147,106$ weight value changes. The heat map summarizes how often a weight value (rows) is changed to another value (columns) in those attacks.
  • ...and 1 more figure

Theorems & Definitions (2)

  • Definition 1: ling2004coding
  • Definition 2: ling2004coding