Two Heads are Better than One: Robust Learning Meets Multi-branch Models
Zongyuan Zhang, Qingwen Bu, Tianyang Duan, Zheng Lin, Yuhao Qing, Zihan Fang, Heming Cui, Dong Huang
TL;DR
The paper addresses adversarial vulnerability by shifting focus from data augmentation to model-space diversity. It introduces BORT, a multi-branch adversarial training framework with a branch-orthogonal loss that enforces orthogonal solution spaces among branches, trained sequentially with PGD-based adversaries. Under ℓ∞ norm-bounded perturbations of size ε = 8/255, the approach achieves state-of-the-art robust accuracy on CIFAR-10 (67.3%) and CIFAR-100 (41.5%), and competitive SVHN performance, without using additional data, outperforming data-augmented baselines. This work highlights a practical model-centric complement to data-centric defenses, offering substantial robustness benefits with moderate training cost.
Abstract
Deep neural networks (DNNs) are vulnerable to adversarial examples: inputs containing imperceptible perturbations that mislead a DNN into producing incorrect outputs. Adversarial training, a reliable and effective defense, can significantly reduce this vulnerability and has become the de facto standard for robust learning. While many recent works follow a data-centric philosophy, such as generating better adversarial examples or using generative models to produce additional training data, we return to the models themselves and revisit adversarial robustness from the perspective of deep feature distribution, as an insightful complement to those approaches. In this paper, we propose \textit{Branch Orthogonality adveRsarial Training} (BORT), which obtains state-of-the-art performance using solely the original dataset for adversarial training. To realize our design idea of integrating multiple orthogonal solution spaces, we leverage a simple multi-branch neural network and propose a corresponding loss function, the branch-orthogonal loss, which makes the solution spaces of the multi-branch model mutually orthogonal. We evaluate our approach on CIFAR-10, CIFAR-100 and SVHN against $\ell_{\infty}$ norm-bounded perturbations of size $\epsilon = 8/255$. Extensive experiments show that our method surpasses all state-of-the-art methods without any tricks. Compared to all methods that do not use additional data for training, our models achieve 67.3\% and 41.5\% robust accuracy on CIFAR-10 and CIFAR-100, respectively (improving upon the state of the art by +7.23\% and +9.07\%).
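To make the idea of mutually orthogonal branches concrete, the following is a minimal sketch of one possible orthogonality penalty. The abstract does not state the exact form of the branch-orthogonal loss, so everything here is an assumption for illustration: the function name `branch_orthogonality_penalty`, the choice of squared cosine similarity between per-branch feature vectors, and the array layout are hypothetical and may differ from the formulation used in the paper.

```python
import numpy as np


def branch_orthogonality_penalty(features):
    """Mean squared cosine similarity between features of different branches.

    Hypothetical illustration only, not the loss defined in the paper.
    `features` has shape (num_branches, batch_size, feature_dim), with
    num_branches >= 2. The result is 0 when every pair of branches produces
    orthogonal feature vectors and 1 when all branches are collinear.
    """
    f = np.asarray(features, dtype=np.float64)
    # Normalise each feature vector so that dot products become cosines.
    norms = np.linalg.norm(f, axis=-1, keepdims=True)
    f = f / np.maximum(norms, 1e-12)
    # sim[b, i, j] is the cosine between branch i and branch j on sample b.
    sim = np.einsum("ibd,jbd->bij", f, f)
    num_branches = f.shape[0]
    # Ignore the diagonal, where each branch is compared with itself.
    off_diagonal = ~np.eye(num_branches, dtype=bool)
    return float(np.mean(sim[:, off_diagonal] ** 2))


# Two branches, one sample, two-dimensional features.
orthogonal = np.array([[[1.0, 0.0]], [[0.0, 1.0]]])
collinear = np.array([[[1.0, 0.0]], [[1.0, 0.0]]])
penalty_orthogonal = branch_orthogonality_penalty(orthogonal)
penalty_collinear = branch_orthogonality_penalty(collinear)
```

In a training loop, a term of this kind would typically be added to the adversarial classification loss with a weighting coefficient, so that branches are pushed toward distinct feature directions while still fitting the adversarial examples. How BORT weights and schedules its loss is not specified in this abstract.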
