An Empirical Study of Aegis

Daniel Saragih, Paridhi Goel, Tejas Balaji, Alyssa Li

TL;DR

This paper empirically studies the Aegis defense against bit-flip attacks (BFAs) on neural networks, evaluating baseline models and several fine-tuned variants that incorporate robustness training and data augmentation. Tests on MNIST and CIFAR-10 with ResNet32 and VGG16 show that robustness training often harms generalization and adversarial robustness outside the specific bit-flip threat, while data augmentation can provide broader protection, as evidenced by improved performance under ProFlip and FGSM perturbations. The findings highlight a tradeoff between targeted BFA defense and general adversarial resilience, and they show that the uniformity of DESDN's exits may degrade on simpler, low-entropy datasets. Overall, the work informs design choices for practical BFA defenses and points to future work on broader datasets and attacks, as well as improvements to Aegis components.

Abstract

Bit-flipping attacks are one class of attacks on neural networks, and numerous defense mechanisms have been invented to mitigate their potency. Because the robustness of these defense mechanisms is important, we perform an empirical study of the Aegis framework. We evaluate the baseline mechanisms of Aegis on low-entropy data (MNIST), and we evaluate a pre-trained model with the mechanisms fine-tuned on MNIST. We also compare data augmentation with the robustness training of Aegis, and we examine how Aegis performs under other adversarial attacks, such as the generation of adversarial examples. We find that both the dynamic-exit strategy and the robustness training of Aegis have drawbacks. In particular, we see drops in accuracy relative to baselines when testing on perturbed data and on adversarial examples. Moreover, we find that the dynamic-exit strategy loses its uniformity when tested on simpler datasets. The code for this project is available on GitHub.
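
To make the threat concrete, the following minimal sketch shows what a single bit flip does to one stored weight. It assumes the weight is kept as a signed 8-bit integer in two's complement; that storage format and the helper name `flip_bit` are illustrative assumptions, not details taken from the paper or from the Aegis code.

```python
def flip_bit(weight: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit weight (two's complement).

    Hypothetical helper for illustration only; it is not part of Aegis.
    """
    if not -128 <= weight <= 127:
        raise ValueError("weight must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be in the range 0..7")
    raw = weight & 0xFF   # reinterpret the signed value as an unsigned byte
    raw ^= 1 << bit       # toggle the selected bit
    # Map the unsigned byte back to the signed range [-128, 127].
    return raw - 256 if raw >= 128 else raw


if __name__ == "__main__":
    w = 3  # bit pattern 0b00000011
    print(flip_bit(w, 0))  # flipping the least significant bit gives 2
    print(flip_bit(w, 7))  # flipping the sign bit gives -125
```

Flipping the low-order bit barely changes the weight, while flipping the sign bit moves it across most of the representable range, which is why a small number of well-chosen flips can be damaging.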

Paper Structure (12 sections, 1 equation, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Base model (e.g., ResNet32 or VGG16) equipped with internal classifiers $(C_1, \ldots, C_N)$ for early exiting. Figure retrieved from the original paper (Wang et al., 2023).
  • Figure 2: The ROB mechanism. We create a copy of the model, $\hat{M}$, referred to as the flipped model, and flip vulnerable bits of $\hat{M}$ to simulate an attack. Figure retrieved from the original paper (Wang et al., 2023).
  • Figure 3: Number of exits per layer for the R-MNIST and R-CIFAR models.
  • Figure 4: Number of exits per layer for the V-MNIST and V-CIFAR models.
  • Figure 5: Epsilon vs. accuracy for the fine-tuned R-MNIST and R-CIFAR models.
  • ...and 1 more figure