No Data, No Optimization: A Lightweight Method To Disrupt Neural Networks With Sign-Flips

Ido Galil, Moshe Kimhi, Ran El-Yaniv

TL;DR

This work uncovers a data-free vulnerability in deep neural networks where flipping a small number of sign bits in parameters can cause drastic accuracy loss. It introduces Deep Neural Lesion (DNL), a lightweight, pass-free attack that identifies and flips critical parameters, and an enhanced 1P-DNL variant that uses a single forward/backward pass to increase impact. The authors demonstrate broad effectiveness across architectures and datasets, including ImageNet-scale models, and propose a practical defense by selectively protecting the most vulnerable sign bits. The study highlights significant security implications for deployed DNNs and motivates defenses at the parameter and hardware levels to mitigate sign-bit attacks.

Abstract

Deep Neural Networks (DNNs) can be catastrophically disrupted by flipping only a handful of sign bits in their parameters. We introduce Deep Neural Lesion (DNL), a data-free, lightweight method that locates these critical parameters and triggers massive accuracy drops. We validate its efficacy on a wide variety of computer vision models and datasets. The method requires no training data or optimization and can be carried out through common software-, firmware-, or hardware-based attack vectors. An enhanced variant that uses a single forward and backward pass further amplifies the damage beyond DNL's zero-pass approach. Flipping just two sign bits in ResNet50 on ImageNet reduces accuracy by 99.8%. We also show that selectively protecting a small fraction of vulnerable sign bits provides a practical defense against such attacks.
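
To make the notion of a sign-bit flip concrete, the following is a minimal sketch, assuming parameters are stored as IEEE 754 float32 values. It flips bit 31 of the encoding and applies the flip to the largest-magnitude weights, which corresponds to the magnitude-based baseline mentioned in the figure captions rather than to the DNL scoring itself (not described in this summary). The names `flip_sign_bit` and `flip_largest_magnitude` are hypothetical helpers introduced only for illustration.

```python
import struct


def flip_sign_bit(value: float) -> float:
    """Flip bit 31 (the sign bit) of the float32 encoding of `value`."""
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR with the sign-bit mask, then reinterpret the result as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ 0x80000000))
    return flipped


def flip_largest_magnitude(weights: list[float], k: int) -> list[float]:
    """Return a copy of `weights` with the k largest-magnitude entries sign-flipped.

    Assumption: magnitude is used as a stand-in importance score; this is
    not the paper's DNL criterion.
    """
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    attacked = list(weights)
    for i in order[:k]:
        attacked[i] = flip_sign_bit(attacked[i])
    return attacked


weights = [0.5, -2.0, 0.25, 1.5]
attacked = flip_largest_magnitude(weights, 2)
print(attacked)
```

A single flipped bit leaves the magnitude untouched and only negates the value, which is why such a change needs neither data nor optimization to apply.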

Paper Structure

This paper contains 15 sections, 6 equations, 16 figures, 3 tables, 2 algorithms.

Figures (16)

  • Figure 1: Impact of randomly flipping sign bits on model performance. The plot shows the distribution of $AR(\cdot)$ values across 48 ImageNet models when up to 100,000 sign bits are flipped at random.
  • Figure 2: Comparison of $mAR_{10}$ across different strategies applied to 48 ImageNet models. Magnitude-based sign-flips consistently exhibit fatal reductions in model accuracy, outperforming random flips. The proposed methods, DNL and 1P-DNL, demonstrate even greater effectiveness by targeting critical parameters, achieving significant accuracy degradation with minimal computational overhead.
  • Figure 3: Comparing the $AR(10)$ under different strategies across 48 ImageNet models. The figure highlights the superior performance of 1P-DNL in causing substantial accuracy drops with up to 10 sign flips.
  • Figure 4: Horizontal edge detection filter (based on the Sobel Y filter) with one or two sign flips and their corresponding extracted features. With a single sign flip, the filter is severely disrupted, rendering it unable to detect edges effectively. However, with two bit flips, the resulting errors may partially offset each other, allowing the filter to retain some edge-detection capability and produce features similar to the original.
  • Figure 5: Targeting only the first $l$ layers; the x-axis reports $mAP(10)$.
  • ...and 11 more figures
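
The effect described in the Figure 4 caption can be sketched numerically. The example below applies a Sobel Y kernel to two 3×3 patches, one containing a horizontal edge and one flat, before and after a single sign flip. Which weight the figure actually flips is not stated here, so flipping the bottom-center weight is an assumption, and `response` is a hypothetical helper.

```python
def response(kernel: list[list[int]], patch: list[list[int]]) -> int:
    """Correlate a 3x3 kernel with a 3x3 patch (element-wise product, summed)."""
    return sum(kernel[r][c] * patch[r][c] for r in range(3) for c in range(3))


# Sobel Y kernel: responds to horizontal edges, ignores flat regions.
sobel_y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]

# Assumed attack: flip the sign of the bottom-center weight (2 -> -2).
flipped = [row[:] for row in sobel_y]
flipped[2][1] = -flipped[2][1]

edge_patch = [[0, 0, 0],
              [0, 0, 0],
              [1, 1, 1]]   # dark above, bright below: a horizontal edge
flat_patch = [[1, 1, 1],
              [1, 1, 1],
              [1, 1, 1]]   # uniform intensity, no edge

print(response(sobel_y, edge_patch), response(sobel_y, flat_patch))   # 4 0
print(response(flipped, edge_patch), response(flipped, flat_patch))   # 0 -4
```

In this toy setting the intact filter fires on the edge and stays silent on the flat patch, while the flipped filter does the opposite, consistent with the caption's claim that one sign flip can severely disrupt the filter.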