RegMix: Adversarial Mutual and Generalization Regularization for Enhancing DNN Robustness
Zhenyu Liu, Varun Ojha
TL;DR
RegMix addresses adversarial robustness by introducing Adversarial Mutual Regularization (AMR) and Adversarial Generalization Regularization (AGR) to adversarial training. AMR employs a pair of weighted KL-divergence terms between final and initial adversarial outputs to allocate primary and auxiliary roles, while AGR adds a clean-output target to improve generalization under attack. Experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet show that RegMix improves robustness against PGD and AutoAttack, with faster convergence and consistent gains across architectures and perturbation budgets. The approach yields smoother loss landscapes and probability distributions closer to ground truth, suggesting practical impact for deploying robust DNNs in security-sensitive settings. Overall, RegMix provides a principled, distillation-inspired framework that enhances defense strength against both similar-scale and stronger adversarial perturbations.
Abstract
Adversarial training is the most effective defense against adversarial attacks. The effectiveness of adversarial training depends on the design of its loss function and regularization term. The most widely used loss function in adversarial training is cross-entropy, with mean squared error (MSE) as the regularization objective. However, MSE enforces overly uniform optimization between two output distributions during training, which limits robustness in adversarial training scenarios. To address this issue, we revisit the idea of mutual learning (originally designed for knowledge distillation) and propose two novel regularization strategies tailored for adversarial training: (i) weighted adversarial mutual regularization and (ii) adversarial generalization regularization. In the former, we formulate a decomposed adversarial mutual Kullback-Leibler divergence (KL-divergence) loss, which allows flexible control over the optimization process by assigning unequal weights to the main and auxiliary objectives. In the latter, we introduce an additional clean target distribution into the adversarial training objective, improving generalization and enhancing model robustness. Extensive experiments demonstrate that our proposed methods significantly improve adversarial robustness compared to existing regularization-based approaches.
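As a rough illustration of the two regularizers described above, the following Python sketch computes a weighted pair of KL-divergence terms between two adversarial output distributions, plus a KL term toward a clean output distribution. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names, the weights `w_main`, `w_aux`, and `w_clean`, the direction of each KL term, and the choice of which term plays the main role are all hypothetical, since the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def weighted_mutual_kl(p_final, p_initial, w_main=1.0, w_aux=0.5):
    """Weighted pair of KL terms between two adversarial outputs.

    Assumption: the "main" term is KL(initial || final) and the
    "auxiliary" term is the reverse direction. Unequal weights let
    one direction dominate the optimization.
    """
    main = kl_divergence(p_initial, p_final)
    aux = kl_divergence(p_final, p_initial)
    return w_main * main + w_aux * aux


def combined_regularizer(p_final, p_initial, p_clean,
                         w_main=1.0, w_aux=0.5, w_clean=1.0):
    """Mutual term plus a term pulling the adversarial output
    toward the clean output distribution (hypothetical weighting)."""
    mutual = weighted_mutual_kl(p_final, p_initial, w_main, w_aux)
    clean = kl_divergence(p_clean, p_final)
    return mutual + w_clean * clean


# Usage with made-up logits for a 3-class problem.
p_initial = softmax([2.0, 0.5, 0.1])   # output on the initial adversarial input
p_final = softmax([1.2, 1.0, 0.3])     # output on the final adversarial input
p_clean = softmax([3.0, 0.2, 0.1])     # output on the clean input
reg = combined_regularizer(p_final, p_initial, p_clean)
```

Setting `w_main` equal to `w_aux` recovers a symmetric mutual-learning penalty; unequal weights are one way to express the main and auxiliary roles the abstract mentions. In training, such a term would be added to the cross-entropy loss in place of an MSE regularizer.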
