Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers
Shayan Mohajer Hamidi, Linfeng Ye
TL;DR
This work tackles two challenges: achieving adversarial robustness in small neural networks, and generalizing robustness to unseen attack types. It introduces AT-AKA, which trains an ensemble of teachers on diverse adversarial samples generated with Stein variational gradient descent (SVGD) and adaptively amalgamates their logits to train a generalized-robust student. The authors extend the framework with CAT-AKA, a collaborative online knowledge distillation (KD) variant that shares knowledge across multiple students. Empirical results on CIFAR-10/100 against FGSM, PGD, and AutoAttack show that AT-AKA and CAT-AKA deliver superior adversarial robustness with minimal loss in clean accuracy, outperforming several state-of-the-art adversarial training and robustness-distillation methods.
Abstract
Adversarial training (AT) is a popular method for training deep neural networks (DNNs) that are robust against adversarial attacks. Yet AT suffers from two shortcomings: (i) the robustness of DNNs trained by AT is highly intertwined with the size of the DNNs, which makes robustness hard to achieve in smaller models; and (ii) the adversarial samples employed during the AT process exhibit poor generalization, leaving DNNs vulnerable to unforeseen attack types. To address these dual challenges, this paper introduces adversarial training via adaptive knowledge amalgamation of an ensemble of teachers (AT-AKA). In particular, we generate a diverse set of adversarial samples as the inputs to an ensemble of teachers, and then adaptively amalgamate the logits of these teachers to train a generalized-robust student. Through comprehensive experiments, we illustrate the superior efficacy of AT-AKA over existing AT methods and adversarial robustness distillation techniques against cutting-edge attacks, including AutoAttack.
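
To make the idea of amalgamating teacher logits concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify how the adaptive weights are computed, so the weighting rule here (a per-sample softmax over negative teacher cross-entropy) is purely an illustrative assumption, as are the function names `amalgamate_logits` and `distillation_loss` and the parameters `temperature` and `tau`.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def amalgamate_logits(teacher_logits, labels, temperature=1.0):
    """Fuse teacher logits with per-sample weights.

    teacher_logits: array of shape (T, N, C) for T teachers, N samples, C classes.
    labels:         integer array of shape (N,).
    ASSUMPTION: a teacher's weight on a sample is a softmax over the negative
    cross-entropy of each teacher on that sample. This rule is hypothetical and
    only stands in for whatever adaptive scheme the paper uses.
    """
    probs = softmax(teacher_logits, axis=-1)                # (T, N, C)
    idx = np.arange(labels.shape[0])
    ce = -np.log(probs[:, idx, labels] + 1e-12)             # (T, N)
    weights = softmax(-ce / temperature, axis=0)            # (T, N), sums to 1 over teachers
    fused = (weights[..., None] * teacher_logits).sum(axis=0)  # (N, C)
    return fused, weights

def distillation_loss(student_logits, fused_logits, tau=2.0):
    """Standard temperature-scaled KL(teacher || student), averaged over samples."""
    p = softmax(fused_logits / tau)
    q = softmax(student_logits / tau)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float(kl.mean() * tau ** 2)
```

In this sketch, teachers that classify a given adversarial sample more confidently and correctly contribute more to the fused target, and the student is then trained to match that target with an ordinary distillation loss.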
