EAD: Elastic-Net Attacks to Deep Neural Networks via Adversarial Examples

Pin-Yu Chen, Yash Sharma, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh

TL;DR

This paper introduces EAD, an elastic-net regularized framework for crafting adversarial examples that blend L1 and L2 distortions to attack deep neural networks. By solving a C&W-like objective with an added L1 penalty via an ISTA-based optimizer, EAD generates sparse yet effective perturbations and includes the strongest existing L2 attack as a special case. The authors demonstrate that L1-oriented adversarial examples achieve competitive attack success rates (ASR) on MNIST, CIFAR10, and ImageNet, with notably improved transferability, and that they complement adversarial training. The work provides new insights into the role of L1 distortion in adversarial machine learning and its security implications for DNNs.
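
The ISTA-based update mentioned above can be sketched as a gradient step on the smooth part of the objective followed by an element-wise shrinkage toward the original image. This is a minimal NumPy illustration, not the authors' implementation: the names `projected_shrinkage`, `ista_step`, `step`, and `grad` are hypothetical, pixels are assumed to lie in [0, 1], and the threshold is passed in directly as `beta` (textbook ISTA would scale it by the step size).

```python
import numpy as np


def projected_shrinkage(z, x0, beta):
    """Shrink z toward x0 by beta, element-wise, and keep the result in [0, 1].

    Coordinates within beta of x0 are reset to x0, which is what makes the
    perturbation z - x0 sparse.
    """
    diff = z - x0
    upper = np.minimum(z - beta, 1.0)  # z is above x0 by more than beta
    lower = np.maximum(z + beta, 0.0)  # z is below x0 by more than beta
    return np.where(diff > beta, upper, np.where(diff < -beta, lower, x0))


def ista_step(x, x0, grad, step, beta):
    """One iteration: gradient descent on the smooth loss, then shrinkage.

    `grad` is assumed to be the gradient of the smooth part of the objective
    (attack loss plus squared L2 distortion) evaluated at x.
    """
    return projected_shrinkage(x - step * grad, x0, beta)


# Toy usage with made-up numbers (no real model involved).
x0 = np.array([0.5, 0.5, 0.5])
z = np.array([0.9, 0.52, 0.1])
shrunk = projected_shrinkage(z, x0, beta=0.1)
print(shrunk)
```

In the toy call, the middle coordinate differs from `x0` by less than `beta`, so it snaps back to `x0`, while the other two move toward `x0` by `beta`.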

Abstract

Recent studies have highlighted the vulnerability of deep neural networks (DNNs) to adversarial examples - a visually indistinguishable adversarial image can easily be crafted to cause a well-trained model to misclassify. Existing methods for crafting adversarial examples are based on $L_2$ and $L_\infty$ distortion metrics. However, despite the fact that $L_1$ distortion accounts for the total variation and encourages sparsity in the perturbation, little has been developed for crafting $L_1$-based adversarial examples. In this paper, we formulate the process of attacking DNNs via adversarial examples as an elastic-net regularized optimization problem. Our elastic-net attacks to DNNs (EAD) feature $L_1$-oriented adversarial examples and include the state-of-the-art $L_2$ attack as a special case. Experimental results on MNIST, CIFAR10 and ImageNet show that EAD can yield a distinct set of adversarial examples with small $L_1$ distortion and attains similar attack performance to the state-of-the-art methods in different attack scenarios. More importantly, EAD leads to improved attack transferability and complements adversarial training for DNNs, suggesting novel insights on leveraging $L_1$ distortion in adversarial machine learning and security implications of DNNs.
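
For orientation, the elastic-net regularized problem described in the abstract can be written roughly as follows. The notation is an assumption made for this summary: $x_0$ is the original image, $x$ the adversarial example, $t$ the target class, $f$ a C&W-style attack loss, $c$ and $\beta$ non-negative regularization parameters, and $p$ the number of pixels.

$$
\min_{x}\; c \cdot f(x, t) \;+\; \beta \,\lVert x - x_0 \rVert_1 \;+\; \lVert x - x_0 \rVert_2^2
\quad \text{subject to} \quad x \in [0, 1]^p
$$

Setting $\beta = 0$ removes the $L_1$ term and leaves a purely $L_2$-regularized problem, which is the sense in which the $L_2$ attack appears as a special case.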

Paper Structure

This paper contains 27 sections, 12 equations, 8 figures, 14 tables, 1 algorithm.

Figures (8)

  • Figure 1: Visual illustration of adversarial examples crafted by EAD (Algorithm 1). The original example is an ostrich image selected from the ImageNet dataset (Figure 1 (a)). The adversarial examples in Figure 1 (b) are classified as the target class labels by the Inception-v3 model.
  • Figure 2: Comparison of EN and $L_1$ decision rules in EAD on MNIST with varying $L_1$ regularization parameter $\beta$ (average case). Compared to the EN rule, for the same $\beta$ the $L_1$ rule attains less $L_1$ distortion but may incur more $L_2$ and $L_\infty$ distortions.
  • Figure 3: Attack success rate (average case) of the C&W method and EAD on MNIST and CIFAR10 with respect to varying temperature parameter $T$ for defensive distillation. Both methods can successfully break defensive distillation.
  • Figure 4: Attack transferability (average case) from the undefended network to the defensively distilled network on MNIST by varying $\kappa$. EAD can attain nearly 99% attack success rate (ASR) when $\kappa=50$, whereas the top ASR of the C&W attack is nearly 88% when $\kappa=40$.
  • Figure 5: Comparison of EN and $L_1$ decision rules in EAD on CIFAR10 with varying $L_1$ regularization parameter $\beta$ (average case). Compared to the EN rule, for the same $\beta$ the $L_1$ rule attains less $L_1$ distortion but may incur more $L_2$ and $L_\infty$ distortions.
  • ...and 3 more figures
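
The EN and $L_1$ decision rules compared in Figures 2 and 5 can be read as two ways of picking the final adversarial example from a set of successful candidates. The sketch below assumes the EN rule scores a candidate by $\beta \lVert \delta \rVert_1 + \lVert \delta \rVert_2^2$ and the $L_1$ rule by $\lVert \delta \rVert_1$ alone, where $\delta$ is the perturbation. The function name `select_adversarial` and the toy candidates are hypothetical and serve only as an illustration.

```python
import numpy as np


def select_adversarial(candidates, x0, beta, rule="EN"):
    """Return the candidate with the smallest distortion under the given rule.

    rule="EN": score is beta * L1 + squared L2 (elastic-net distortion).
    rule="L1": score is the L1 distortion only.
    All candidates are assumed to be successful adversarial examples already.
    """
    if rule not in ("EN", "L1"):
        raise ValueError("rule must be 'EN' or 'L1'")
    best, best_score = None, np.inf
    for x in candidates:
        delta = (x - x0).ravel()
        l1 = np.abs(delta).sum()
        l2_sq = float(np.dot(delta, delta))
        score = beta * l1 + l2_sq if rule == "EN" else l1
        if score < best_score:
            best, best_score = x, score
    return best


# Toy candidates: one sparse with a large change, one dense with small changes.
x0 = np.zeros(4)
sparse = np.array([0.5, 0.0, 0.0, 0.0])
dense = np.array([0.2, 0.2, 0.2, 0.2])
picked_en = select_adversarial([sparse, dense], x0, beta=0.01, rule="EN")
picked_l1 = select_adversarial([sparse, dense], x0, beta=0.01, rule="L1")
```

With these made-up numbers the $L_1$ rule prefers the sparse candidate while the EN rule prefers the dense one, which mirrors the trade-off noted in the captions: lower $L_1$ distortion can come with higher $L_2$ and $L_\infty$ distortion.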