Properties that allow or prohibit transferability of adversarial attacks among quantized networks

Abhishek Shrestha, Jürgen Großmann

TL;DR

This study addresses the security implications of deploying quantized neural networks on embedded devices by examining how adversarial attacks transfer across different bitwidths. It systematically evaluates multiple attack algorithms (FGSM, JSMA, UAP, Boundary Attack, CW) on MNIST and CIFAR-10 with six quantization levels, including scenarios where source and target differ in capacity and architecture, using DoReFa-Net quantization and training-time STEs. The findings show that quantization generally reduces transferability, but certain attacks—especially UAP at higher distortion budgets—can transfer more effectively, while gradient-based attacks remain relatively brittle to quantization shifts. A key practical takeaway is that the average transferability observed between quantized variants can serve as a proxy for transferability to unseen target models with different capacity or architecture, informing robustness assessments for edge-device deployments.
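
The DoReFa-Net quantization and straight-through estimator (STE) mentioned above can be illustrated with a minimal sketch. It assumes the standard DoReFa-Net k-bit formulas for values in [0, 1] and for weights; the function names (`quantize_k`, `quantize_weights`, `ste_backward`) are hypothetical and are not taken from the authors' code, and the paper's actual training setup may differ.

```python
import math
from typing import List, Sequence


def quantize_k(r: float, k: int) -> float:
    """Uniformly quantize r in [0, 1] onto 2**k - 1 equal steps."""
    if k < 1:
        raise ValueError("k must be a positive bitwidth")
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    levels = (1 << k) - 1
    # Python's round() resolves exact .5 ties to the nearest even integer.
    return round(levels * r) / levels


def quantize_weights(weights: Sequence[float], k: int) -> List[float]:
    """DoReFa-style k-bit weight quantization onto a grid in [-1, 1].

    Sketch of the general k-bit formula only; DoReFa-Net treats the
    1-bit case separately, which is not covered here.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    squashed = [math.tanh(w) for w in weights]
    max_abs = max(abs(t) for t in squashed)
    quantized = []
    for t in squashed:
        # Map tanh(w) into [0, 1]; all-zero weights map to the midpoint.
        scaled = 0.5 if max_abs == 0.0 else t / (2.0 * max_abs) + 0.5
        scaled = min(1.0, max(0.0, scaled))  # guard against rounding drift
        quantized.append(2.0 * quantize_k(scaled, k) - 1.0)
    return quantized


def ste_backward(upstream_grad: float) -> float:
    """Straight-through estimator: the rounding step has zero gradient
    almost everywhere, so the backward pass treats it as the identity."""
    return upstream_grad
```

With `k = 2`, every quantized weight lands on one of four values: -1, -1/3, 1/3, or 1.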

Abstract

Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples. Further, these adversarial examples are found to be transferable from the source network in which they are crafted to a black-box target network. As the trend of using deep learning on embedded devices grows, it becomes relevant to study the transferability properties of adversarial examples among compressed networks. In this paper, we consider quantization as a network compression technique and evaluate the performance of transfer-based attacks when the source and target networks are quantized at different bitwidths. We explore how algorithm-specific properties affect transferability by considering various adversarial example generation algorithms. Furthermore, we examine transferability in a more realistic scenario where the source and target networks may differ in bitwidth and other model-related properties like capacity and architecture. We find that although quantization reduces transferability, certain attack types demonstrate an ability to enhance it. Additionally, the average transferability of adversarial examples among quantized versions of a network can be used to estimate the transferability to quantized target networks with varying capacity and architecture.
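
To make the idea of crafting an adversarial example on a source model concrete, here is a minimal FGSM sketch. It assumes a hypothetical logistic-regression source model rather than the deep networks studied in the paper, so that the input gradient can be written in closed form; all names and values are illustrative only.

```python
import math
from typing import List, Sequence


def logit(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Linear score of the toy model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Predicted class (0 or 1) of the toy logistic-regression model."""
    return 1 if logit(w, b, x) > 0 else 0


def fgsm(w: Sequence[float], b: float, x: Sequence[float],
         y: int, eps: float) -> List[float]:
    """One FGSM step: x_adv = clip(x + eps * sign(dLoss/dx), 0, 1).

    For binary cross-entropy on a logistic model, the input gradient
    is (sigmoid(w.x + b) - y) * w.
    """
    p = 1.0 / (1.0 + math.exp(-logit(w, b, x)))
    grad = [(p - y) * wi for wi in w]
    signs = [(g > 0) - (g < 0) for g in grad]
    # Clip to [0, 1], the usual range for normalized pixel values.
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, signs)]
```

In a transfer-based attack, `x_adv` would be crafted on the source model as above and then passed, unchanged, to a separate target model whose parameters the attacker cannot access.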

Paper Structure

This paper contains 16 sections, 7 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Transferability of adversarial attacks among different bitwidth versions of the Resnet20 model. In each matrix, rows indicate the source networks, while columns indicate the corresponding target networks. Row and column headers specify the bitwidth of the source and target models, respectively. The source and target model IDs are labelled alongside the corresponding headers. Cell values correspond to the adversarial accuracy of the target. Higher values (darker colours) indicate less transferability, while lower values (lighter colours) indicate more transferability. The diagonal values correspond to attack performance on the source (the source and target are the same model). Each value in the "Average" column indicates the average adversarial accuracy of all target models (one complete row) against a single attack source.
  • Figure 2: Adversarial examples generated on the FP Mnist A model using UAP with $\xi = 0.6$ and FGSM with $\varepsilon = 0.6$. Both sets show the first 10 images from the MNIST dataset.
  • Figure 3: Transferability of adversarial attacks when the source and target networks differ in capacity. The five matrices in the left column depict the transferability of each of the five attacks when the source networks are different bitwidth versions of Resnet20. The matrices in the right column depict the transferability of the same attacks when the source networks are different bitwidth versions of Resnet32. The target networks are FP Resnet44 and its quantized versions in all cases.
  • Figure 4: Transferability of adversarial attacks when the source and target networks differ in architecture. The source networks are different bitwidth versions of Resnet20 and the target networks are different bitwidth versions of Cifar A.
  • Figure 5: Adversarial samples created using the FGSM, JSMA, BA, UAP, and CW attacks on various bitwidth versions of the Mnist A model.
  • ...and 1 more figure
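
The transferability matrices described in the Figure 1 caption (rows as sources, columns as targets, cells as the target's adversarial accuracy, plus a per-row "Average" column) could be assembled roughly as follows. This is a sketch under assumptions: the helper names and the `preds_by_pair` layout are hypothetical, and the predictions are presumed to have been collected already by running each target on adversarial examples crafted on each source.

```python
from typing import Dict, List, Sequence, Tuple


def adversarial_accuracy(predictions: Sequence[int],
                         labels: Sequence[int]) -> float:
    """Fraction of adversarial examples the target still classifies correctly."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and aligned")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def transfer_matrix(
    preds_by_pair: Dict[Tuple[str, str], Sequence[int]],
    labels: Sequence[int],
    model_ids: Sequence[str],
) -> Dict[str, List[float]]:
    """Build one row per source model.

    preds_by_pair[(source, target)] holds the target's predictions on
    adversarial examples crafted on the source. Each row lists the
    adversarial accuracy of every target (in model_ids order) followed
    by the average over the complete row, diagonal included.
    """
    rows: Dict[str, List[float]] = {}
    for src in model_ids:
        row = [adversarial_accuracy(preds_by_pair[(src, tgt)], labels)
               for tgt in model_ids]
        rows[src] = row + [sum(row) / len(row)]
    return rows
```

Lower values in a row mean that examples crafted on that source transfer more readily, matching the reading of the matrices given in the caption.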