Mitigating Accuracy-Robustness Trade-off via Balanced Multi-Teacher Adversarial Distillation

Shiji Zhao, Xizhe Wang, Xingxing Wei

TL;DR

The paper tackles the intrinsic accuracy-robustness trade-off in Adversarial Training by introducing Balanced Multi-Teacher Adversarial Robustness Distillation (B-MTARD), which leverages a clean teacher for natural inputs and a robust teacher for adversarial inputs. It proposes two novel balancing mechanisms: Entropy-Based Balance, which aligns teachers’ knowledge scales by adjusting temperatures to equalize information entropy, and Normalization Loss Balance, which equalizes the student’s learning speeds from different teachers through adaptive loss weighting. The approach is validated on CIFAR-10/100 and Tiny-ImageNet, achieving state-of-the-art Weighted Robust Accuracy across multiple white-box and black-box attacks and outperforming prior ARD and multi-teacher distillation methods. The results demonstrate the practical viability of decoupling accuracy and robustness and offer a framework potentially extendable to other multi-objective learning problems.

Abstract

Adversarial Training is a practical approach for improving the robustness of deep neural networks against adversarial attacks. Although it brings reliable robustness, Adversarial Training degrades performance on clean examples, which means a trade-off exists between accuracy and robustness. Recently, some studies have applied knowledge distillation to Adversarial Training and achieved competitive robustness, but accuracy on clean samples remains limited. In this paper, to mitigate the accuracy-robustness trade-off, we introduce Balanced Multi-Teacher Adversarial Robustness Distillation (B-MTARD), which guides the model's Adversarial Training process by applying a strong clean teacher and a strong robust teacher to handle the clean examples and adversarial examples, respectively. During optimization, to ensure that the different teachers show similar knowledge scales, we design the Entropy-Based Balance algorithm, which adjusts the teachers' temperatures to keep their information entropy consistent. In addition, to ensure that the student learns at a relatively consistent speed from multiple teachers, we propose the Normalization Loss Balance algorithm, which adjusts the learning weights of the different types of knowledge. A series of experiments on three public datasets demonstrates that B-MTARD outperforms state-of-the-art methods against various adversarial attacks.
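
The two-teacher objective described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `beta` exponent, and the specific weighting rule (weights proportional to each branch's loss relative to its initial value) are all hypothetical, chosen only to show how "consistent learning speed" could translate into adaptive loss weights.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def balanced_weights(loss_nat, loss_adv, init_nat, init_adv, beta=1.0):
    """Hypothetical weighting rule (an assumption of this sketch): a branch
    whose loss has dropped less relative to its initial value is treated as
    learning more slowly and therefore receives a larger weight."""
    rel_nat = (loss_nat / init_nat) ** beta
    rel_adv = (loss_adv / init_adv) ** beta
    total = rel_nat + rel_adv
    return rel_nat / total, rel_adv / total

def total_loss(student_nat, student_adv, teacher_nat, teacher_adv, w_nat, w_adv):
    """Weighted sum of the clean-branch and adversarial-branch distillation terms."""
    l_nat = kl_divergence(teacher_nat, student_nat)  # clean teacher on clean input
    l_adv = kl_divergence(teacher_adv, student_adv)  # robust teacher on adversarial input
    return w_nat * l_nat + w_adv * l_adv, l_nat, l_adv

# Toy, made-up logits for a 3-class problem.
teacher_nat = softmax([5.0, 1.0, 0.5])
teacher_adv = softmax([2.0, 1.5, 1.0])
student_nat = softmax([3.0, 1.2, 0.8])
student_adv = softmax([1.5, 1.4, 1.2])

# Pretend the clean branch has halved its loss while the adversarial branch lags.
w_nat, w_adv = balanced_weights(loss_nat=0.5, loss_adv=0.8, init_nat=1.0, init_adv=1.0)
loss, l_nat, l_adv = total_loss(student_nat, student_adv, teacher_nat, teacher_adv, w_nat, w_adv)
print(f"w_nat={w_nat:.3f} w_adv={w_adv:.3f} total={loss:.4f}")
```

In this toy setting the adversarial branch gets the larger weight because its relative loss is higher; a real training loop would recompute the weights as the two losses evolve.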

Paper Structure

This paper contains 22 sections, 1 theorem, 33 equations, 6 figures, 11 tables, and 1 algorithm.

Key Result

Theorem 1

The teacher’s knowledge scale $K^{T}$ is negatively related to the information entropy $H(P^{T})$ of the teacher’s predicted distribution, and the two are related as follows:

$$K^{T} = \log C - H(P^{T}),$$

where $P^{T}=\{p_{1}^{T}(x), \ldots, p_{C}^{T}(x)\}$ denotes the predicted distribution of the teacher model $T$, and $\log C$ is a constant.
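
The constant $\log C$ in the theorem has a natural numerical reading: for any distribution over $C$ classes, the KL divergence to the uniform distribution equals $\log C$ minus the entropy. The sketch below checks that identity on made-up logits and shows entropy rising with temperature. Treating the divergence from uniform as a stand-in for the knowledge scale $K^{T}$ is an assumption of this example, and the helper names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy H(p) in nats; zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def kl_to_uniform(p):
    """KL(p || U), where U is the uniform distribution over len(p) classes."""
    c = len(p)
    return sum(pi * math.log(pi * c) for pi in p if pi > 0.0)

logits = [4.0, 1.0, 0.5, 0.2]  # made-up teacher logits, C = 4
for tau in (0.5, 1.0, 2.0, 4.0):
    p = softmax(logits, tau)
    h = entropy(p)
    gap = math.log(len(p)) - h  # log C - H(P)
    # The two right-hand columns agree up to floating-point error.
    print(f"tau={tau:>3}: H={h:.4f}  logC-H={gap:.4f}  KL(P||U)={kl_to_uniform(p):.4f}")
```

As the temperature grows, the prediction flattens, the entropy approaches $\log C$, and the gap $\log C - H(P)$ shrinks toward zero.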

Figures (6)

  • Figure 1: The framework of our Balanced Multi-Teacher Adversarial Robustness Distillation (B-MTARD). In the training process, we first generate adversarial examples from the student. Then, with the guidance of the clean teacher and the robust teacher, the student is trained on clean examples and adversarial examples, respectively. Here, we apply the Entropy-Based Balance algorithm to adjust the teachers' knowledge scales. In addition, we use the Normalization Loss Balance algorithm to balance the student's knowledge learning speed from different teachers.
  • Figure 2: Schematic diagram of the same logits processed with different temperatures $\tau$, yielding different predicted probabilities $p$. As the figure shows, the temperature $\tau$ affects the model prediction and, in turn, the information entropy $H$.
  • Figure 3: Ablation study with a ResNet-18 student trained using variants of our B-MTARD and the Baseline method on CIFAR-10. NLB and EBB denote Normalization Loss Balance and Entropy-Based Balance, respectively. B-MTARD is our final version, which corresponds to Baseline+NLB+EBB. Baseline+NLB is the result from our ECCV version [zhao2022enhanced]. All results are from the best checkpoint based on W-Robust Acc.
  • Figure 4: The total loss $L_{total}$ with a ResNet-18 student trained using variants of RSLAD, Baseline, and Baseline+NLB on CIFAR-10. NLB is the abbreviation of Normalization Loss Balance. The x-axis is the training epoch, and the y-axis is $L_{total}$ at that epoch. The left panel shows $L_{total}$ over the whole training process; the right panel shows $L_{total}$ over the final 60 epochs.
  • Figure 5: The relative training loss $\tilde{L}$ with a ResNet-18 student on CIFAR-10. The left panel is Baseline, and the right panel is Baseline+NLB. NLB denotes Normalization Loss Balance. The x-axis is the training epoch, and the y-axis shows the relative losses $\tilde{L}_{adv}$ and $\tilde{L}_{nat}$ at that epoch.
  • ...and 1 more figure
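
The temperature-entropy relationship in the Figure 2 caption, together with the Entropy-Based Balance idea in Figure 1, suggests one simple way to make two teachers' entropies agree: search for the temperature at which one teacher's softmax entropy matches the other's. The sketch below uses a plain bisection search as a stand-in. The paper's actual temperature update rule is not reproduced in this summary, so the procedure, the logits, and the function names here are all assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p):
    """Shannon entropy H(p) in nats; zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def temperature_for_entropy(logits, target, lo=0.05, hi=50.0, iters=60):
    """Bisection search for the temperature whose softmax entropy equals
    `target`. Relies on entropy being non-decreasing in the temperature and
    on `target` lying between the entropies at `lo` and `hi`."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy(softmax(logits, mid)) < target:
            lo = mid  # too peaked: raise the temperature
        else:
            hi = mid  # too flat: lower the temperature
    return 0.5 * (lo + hi)

# Made-up logits: a confident "clean" teacher and a less confident "robust" teacher.
clean_logits = [9.0, 2.0, 1.0, 0.5]
robust_logits = [3.0, 2.0, 1.5, 1.0]

# Use the robust teacher's entropy at temperature 1 as the common target.
target = entropy(softmax(robust_logits, 1.0))
tau_clean = temperature_for_entropy(clean_logits, target)

print(f"target entropy: {target:.4f}")
print(f"clean teacher temperature: {tau_clean:.4f}")
print(f"clean teacher entropy at that temperature: {entropy(softmax(clean_logits, tau_clean)):.4f}")
```

Because the clean teacher's logits are far more peaked, it needs a temperature above 1 to reach the same entropy as the robust teacher. A gradient-style update applied during training would serve the same purpose without a per-sample search.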

Theorems & Definitions (4)

  • Theorem 1
  • Proof 1
  • Proof 2
  • Proof 3