Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers

Shayan Mohajer Hamidi, Linfeng Ye

TL;DR

This work tackles the dual challenges of achieving adversarial robustness in small neural networks and generalizing robustness to unseen attack types. It introduces AT-AKA, which uses an ensemble of teachers trained on SVGD-generated diverse adversarial samples and adaptively amalgamates their logits to train a generalized robust student. The authors extend the framework with CAT-AKA, a collaborative online KD variant that shares knowledge across multiple students. Empirical results on CIFAR-10/100 against FGSM, PGD, and AutoAttack show that AT-AKA and CAT-AKA deliver superior adversarial robustness with minimal losses in clean accuracy, outperforming several state-of-the-art adversarial training and robustness-distillation methods.

Abstract

Adversarial training (AT) is a popular method for training robust deep neural networks (DNNs) against adversarial attacks. Yet, AT suffers from two shortcomings: (i) the robustness of DNNs trained by AT is highly intertwined with the size of the DNNs, posing challenges in achieving robustness in smaller models; and (ii) the adversarial samples employed during the AT process exhibit poor generalization, leaving DNNs vulnerable to unforeseen attack types. To address these dual challenges, this paper introduces adversarial training via adaptive knowledge amalgamation of an ensemble of teachers (AT-AKA). In particular, we generate a diverse set of adversarial samples as the inputs to an ensemble of teachers, and then adaptively amalgamate the logits of these teachers to train a generalized-robust student. Through comprehensive experiments, we illustrate the superior efficacy of AT-AKA over existing AT methods and adversarial robustness distillation techniques against cutting-edge attacks, including AutoAttack.

Paper Structure

This paper contains 27 sections, 9 equations, 3 figures, and 5 tables.

Figures (3)

  • Figure 1: A two-dimensional caricature of the classification boundaries learnt by different methods: (a) an ordinary learning method using the clean samples; (b) adversarial training where PGD is used to generate adversarial samples; (c) $T_1$ via the first set of adversarial samples; (d) $T_2$ via the second set of adversarial samples; and (e) the amalgamation of $T_1$ and $T_2$.
  • Figure 2: The AT-AKA framework. The teachers and the student are denoted by $\{T_i\}_{i \in [n]}$ and $S$, respectively. The teachers $\{T_i\}_{i \in [n]}$ and the student $S$ are all trained from scratch. The clean sample $\boldsymbol{x}$ is fed to the student, whereas $n$ distinct adversarial samples generated by SVGD are fed to the teachers. The logits of the teachers are then amalgamated, and the resulting logit is used to distill knowledge into the student.
  • Figure 3: The CAT-AKA framework. All the models are students, denoted by $\{S_i\}_{i \in [n]}$, and all are trained from scratch. The clean sample $\boldsymbol{x}$ is fed to the SVGD block to generate $n$ distinct adversarial samples. The logits of the students are then amalgamated, and the resulting logit is used as supervision for the students.
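
The amalgamate-then-distill step described in the Figure 2 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: this summary does not state the adaptive weighting rule, so the sketch assumes each teacher is weighted per sample by a softmax over its negative cross-entropy on the true label. The function names `amalgamate_logits` and `distillation_loss`, the temperature value, and the random stand-in logits are all hypothetical, and SVGD sample generation and the networks themselves are omitted.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def log_softmax(z, axis=-1):
    # Numerically stable log-softmax.
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))

def amalgamate_logits(teacher_logits, labels):
    """Combine teacher logits into a single target logit per sample.

    teacher_logits: array of shape (n_teachers, batch, classes)
    labels:         int array of shape (batch,)
    ASSUMPTION: weights are softmax(-cross_entropy) across teachers,
    so teachers that are more correct on a sample count more.
    """
    _, batch, _ = teacher_logits.shape
    logp = log_softmax(teacher_logits, axis=-1)
    ce = -logp[:, np.arange(batch), labels]      # (n_teachers, batch)
    weights = softmax(-ce, axis=0)               # sums to 1 over teachers
    amalgam = (weights[:, :, None] * teacher_logits).sum(axis=0)
    return amalgam, weights

def distillation_loss(student_logits, target_logits, temperature=4.0):
    # Temperature-scaled KL(target || student), averaged over the batch.
    p_t = softmax(target_logits / temperature)
    logp_t = log_softmax(target_logits / temperature)
    logp_s = log_softmax(student_logits / temperature)
    kl = (p_t * (logp_t - logp_s)).sum(axis=-1)
    return temperature ** 2 * kl.mean()

# Random stand-ins for the teachers' outputs on n adversarial samples
# and the student's output on the clean sample.
rng = np.random.default_rng(0)
n, batch, classes = 3, 4, 10
teacher_logits = rng.normal(size=(n, batch, classes))
student_logits = rng.normal(size=(batch, classes))
labels = rng.integers(0, classes, size=batch)

amalgam, weights = amalgamate_logits(teacher_logits, labels)
loss = distillation_loss(student_logits, amalgam)
print(amalgam.shape, weights.sum(axis=0), float(loss))
```

For the collaborative variant in the Figure 3 caption, the same two functions could in principle be reused with the students' own logits in place of `teacher_logits`, though that reuse is likewise an assumption rather than something the captions specify.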

Theorems & Definitions (1)

  • Remark