DataFreeShield: Defending Adversarial Attacks without Training Data

Hyeyoon Lee, Kanghyun Choi, Dain Kwon, Sunjong Park, Mayoore Selvarasa Jaiswal, Noseong Park, Jonghyun Choi, Jinho Lee

TL;DR

Extensive validation shows that DataFreeShield outperforms baselines, making it the first entirely data-free solution to the adversarial robustness problem.

Abstract

Recent advances in adversarial robustness rely on an abundant set of training data, where using external or additional datasets has become a common setting. However, in real life, the training data is often kept private for security and privacy reasons, while only the pretrained weights are available to the public. In such scenarios, existing methods that assume access to the original data become inapplicable. Thus we investigate the pivotal problem of data-free adversarial robustness, where we try to achieve adversarial robustness without accessing any real data. Through a preliminary study, we highlight the severity of the problem by showing that robustness without the original dataset is difficult to achieve, even with datasets from a similar domain. To address this issue, we propose DataFreeShield, which tackles the problem from two perspectives: surrogate dataset generation and adversarial training using the generated data. Through extensive validation, we show that DataFreeShield outperforms baselines, demonstrating that the proposed method is the first entirely data-free solution to the adversarial robustness problem.

Paper Structure

This paper contains 38 sections, 9 equations, 14 figures, 26 tables, 1 algorithm.

Figures (14)

  • Figure 1: Motivational experiment using biomedical datasets from MedMNIST v2. (a) demonstrates the problem scenario, where the adversarial threat persists for models pretrained on private datasets. (b) plots the results when adversarial training is done with a similar or public dataset.
  • Figure 2: Procedure of the proposed method. (a) denotes synthetic data generation using the proposed DSS. (b) shows adversarial training of the target model $S_{\theta}$ using $\mathcal{L}_{DFShield}$ and GradRefine. The pseudo-code is provided in the supplementary material.
  • Figure 3: Comparison of synthesis methods using the same number of 2-D samples. The conventional fixed-coefficient setting leads to limited diversity, while DSS generates diverse samples.
  • Figure 4: (a) gives a conceptual illustration of the generalization gap between synthetic and real data. (b)-(e) show loss surface visualizations on ResNet-20 with CIFAR-10, indicating that GradRefine achieves flatter loss surfaces. Each panel corresponds to a different training loss, with or without GradRefine. We use normalized random directions for the $x$ and $y$ axes, following Li et al. (2018).
  • Figure 5: Performance comparison with varying numbers of training samples. The left panel shows AutoAttack accuracy, while the right panel shows PGD-10 accuracy.
  • ...and 9 more figures
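
Figure 5 reports PGD-10 accuracy, that is, accuracy under a projected gradient descent attack run for 10 steps. For readers unfamiliar with the attack, the following is a minimal sketch of an $L_\infty$ PGD attack on a toy logistic-regression model, written in plain Python. The model, the weights, and the values of `eps` and `alpha` are illustrative assumptions and are not taken from the paper, which evaluates deep networks such as ResNet-20.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(w, b, x, y):
    """Binary cross-entropy of a logistic model on one sample (label y in {0, 1})."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(w, b, x, y, eps=0.1, alpha=0.025, steps=10, seed=0):
    """L-infinity PGD: random start, signed gradient ascent, projection.

    eps, alpha, and steps are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    # Start from a random point inside the eps-ball around x.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(steps):
        grad = input_gradient(w, b, x_adv, y)
        # Move each coordinate by alpha in the direction that increases the loss.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back onto the eps-ball centered at the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy model and sample (assumed values, not from the paper).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = pgd_attack(w, b, x, y)
```

In this sketch the perturbed input stays within `eps` of the clean input in every coordinate while the loss on the true label increases. Adversarial training, in general, feeds such perturbed inputs back into the training loss; image-domain implementations usually also clip the result to the valid pixel range, which is omitted here.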