Efficient Defenses Against Adversarial Attacks

Valentina Zantedeschi, Maria-Irina Nicolae, Ambrish Rawat

TL;DR

Adversarial examples threaten DNN reliability across critical applications. This work introduces two attack-agnostic defenses, bounded RELU activations (BRELU) and Gaussian Data Augmentation (GDA), to stabilize forward propagation and smooth decision boundaries, respectively. The approach achieves robust performance against a broad range of white-box and black-box attacks on MNIST and CIFAR-10, while incurring minimal training overhead and preserving accuracy on clean data. This has practical significance for deploying secure DNNs in real-world settings where defense cost and transferability are key considerations.
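The two defenses named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the bound `t`, the noise level `sigma`, the number of noisy copies, and the clipping of inputs to `[0, 1]` are all assumptions chosen for illustration.

```python
import numpy as np


def bounded_relu(x, t=1.0):
    """Bounded ReLU: clip activations to the range [0, t].

    The bound t is a hypothetical value, not one taken from the paper.
    """
    return np.minimum(np.maximum(x, 0.0), t)


def gaussian_augment(x, y, sigma=0.3, copies=1, seed=0):
    """Gaussian data augmentation: append noisy copies of each sample.

    x: array of shape (n, ...) with values assumed to lie in [0, 1]
    y: array of shape (n,) holding the labels
    sigma, copies: assumed settings for the noise level and copy count
    """
    rng = np.random.default_rng(seed)
    noisy = [
        # Perturb every input with zero-mean Gaussian noise, then clip
        # back to the assumed valid input range.
        np.clip(x + rng.normal(0.0, sigma, size=x.shape), 0.0, 1.0)
        for _ in range(copies)
    ]
    x_aug = np.concatenate([x] + noisy, axis=0)
    # Each noisy copy keeps the label of the sample it was drawn from.
    y_aug = np.concatenate([y] * (copies + 1), axis=0)
    return x_aug, y_aug
```

In this sketch, `bounded_relu` would replace the usual activation inside the network, while `gaussian_augment` would be applied once to the training set before fitting, which is consistent with the low training overhead the summary describes.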

Abstract

Following the recent adoption of deep neural networks (DNN) across a wide range of applications, adversarial attacks against these models have proven to be an indisputable threat. Adversarial samples are crafted with a deliberate intention of undermining a system. In the case of DNNs, the lack of better understanding of their working has prevented the development of efficient defenses. In this paper, we propose a new defense method based on practical observations which is easy to integrate into models and performs better than state-of-the-art defenses. Our proposed solution is meant to reinforce the structure of a DNN, making its prediction more stable and less likely to be fooled by adversarial samples. We conduct an extensive experimental study proving the efficiency of our method against multiple attacks, comparing it to numerous defenses, both in white-box and black-box setups. Additionally, the implementation of our method brings almost no overhead to the training procedure, while maintaining the prediction performance of the original model on clean samples.

Paper Structure

This paper contains 19 sections, 11 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Minimal perturbations needed for fooling a model on the first ten images from MNIST. The original examples are marked by the green rectangle. With our defenses, the attack becomes visually detectable.
  • Figure 2: Classification boundaries and confidence levels on toy datasets. We compare state-of-the-art data augmentation techniques for hardening learning models, in this case a soft-max neural network with two dense hidden layers and RELU activation function. The decision boundary can be identified through the colors and the confidence level contours are marked with black lines. The original points and the additional ones (smaller) are drawn with the label and color corresponding to their class.
  • Figure 3: Accuracy on FGSM white-box attack with respect to $\epsilon$ for different architectures on MNIST.
  • Figure 4: Comparison of different defenses against white-box and black-box attacks on MNIST. For black-box attacks, the adversarial examples are crafted using the ResNet model, without any defense.
  • Figure 5: Comparison of different defenses against white-box and black-box attacks on CIFAR10. For black-box attacks, the adversarial examples are crafted using the ResNet model, without any defense. Note that the adversarial examples crafted with Random + FGSM and $0 < \epsilon \leq 0.05$ correspond to Gaussian noise for $\alpha = 0.05$.
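Several of the captions above report accuracy under FGSM as $\epsilon$ varies. As a rough illustration of what that one-step attack computes, the sketch below applies it to a binary logistic model, where the input gradient has a closed form. The logistic model, the function name, and the clipping to `[0, 1]` are assumptions made to keep the example self-contained; the paper itself attacks deep networks.

```python
import numpy as np


def fgsm_logistic(x, y, w, b, eps):
    """One-step FGSM against a binary logistic model p = sigmoid(x @ w + b).

    x: inputs of shape (n, d), assumed to lie in [0, 1]
    y: binary labels of shape (n,)
    w, b: model weights of shape (d,) and scalar bias (stand-in model)
    eps: size of the perturbation applied to each input coordinate
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    # Gradient of the cross-entropy loss with respect to the input:
    # dL/dx = (p - y) * w for each sample.
    grad_x = (p - y)[:, None] * w[None, :]
    # Move each coordinate by eps in the direction that increases the loss,
    # then clip back to the assumed valid input range.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)
```

Sweeping `eps` over a grid and measuring accuracy on the returned inputs would give a curve of the same kind as the one described for Figure 3, under these simplifying assumptions.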