Towards Robust Neural Networks via Orthogonal Diversity

Kun Fang, Qinghua Tao, Yingwen Wu, Tao Li, Jia Cai, Feipeng Cai, Xiaolin Huang, Jie Yang

TL;DR

The paper tackles adversarial robustness by moving beyond data augmentation to a model-centric strategy called DIO (Diversity via Orthogonality). It inserts $L$ mutually orthogonal heads after a shared backbone and enforces diversity with an orthogonality loss $\mathcal{L}_o$ and a margin-based distance loss $\mathcal{L}_d$, optimizing the joint objective $\mathcal{L}_c + \alpha\mathcal{L}_o + \beta\mathcal{L}_d$. Empirically, DIO improves robustness against both white-box and black-box attacks across CIFAR10/100 and TinyImageNet, and its effectiveness is amplified when combined with data-augmentation defenses like AT, TRADES, GAIRAT, LBGAT, AWP, or DDPM-generated data. Ablation studies confirm the complementary roles of $\mathcal{L}_o$ and $\mathcal{L}_d$. The work highlights the value of model-centered diversification for robustness, while noting costs and integration considerations for broader applicability.
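The joint objective above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the summary does not give the exact form of $\mathcal{L}_o$ or $\mathcal{L}_d$, so `orthogonality_penalty` uses a squared pairwise cosine similarity between flattened head weights as a hypothetical stand-in for $\mathcal{L}_o$, and $\mathcal{L}_d$ is passed in as an already computed number. All function names are invented for this example.

```python
import itertools
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonality_penalty(heads):
    """Hypothetical stand-in for L_o (not the paper's definition):
    sum of squared cosine similarities over all pairs of flattened head weights.
    It is zero exactly when the heads are mutually orthogonal."""
    total = 0.0
    for w_i, w_j in itertools.combinations(heads, 2):
        cos = dot(w_i, w_j) / (math.sqrt(dot(w_i, w_i)) * math.sqrt(dot(w_j, w_j)))
        total += cos ** 2
    return total


def joint_objective(loss_c, loss_o, loss_d, alpha, beta):
    """Weighted sum L_c + alpha * L_o + beta * L_d from the TL;DR."""
    return loss_c + alpha * loss_o + beta * loss_d


# Two orthogonal heads incur no penalty; two parallel heads incur the maximum.
orthogonal_heads = [[1.0, 0.0], [0.0, 1.0]]
parallel_heads = [[1.0, 0.0], [2.0, 0.0]]
print(orthogonality_penalty(orthogonal_heads))
print(orthogonality_penalty(parallel_heads))
```

The penalty is normalized by the weight norms so that rescaling a head does not change it; whether the paper normalizes in this way is not stated in this summary.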

Abstract

Deep Neural Networks (DNNs) are vulnerable to imperceptible image perturbations generated by adversarial attacks, which has motivated research on the adversarial robustness of DNNs. Adversarial training and its variants have proven to be among the most effective techniques for enhancing DNN robustness. Generally, adversarial training focuses on enriching the training data with perturbed data. This data augmentation effect does not contribute to the robustness of the DNN itself and usually causes a drop in clean accuracy. Towards the robustness of the DNN itself, in this paper we propose a novel defense that augments the model so that it learns features that are adaptive to diverse inputs, including adversarial examples. More specifically, to augment the model, multiple paths are embedded into the network, and an orthogonality constraint is imposed on these paths to guarantee the diversity among them. A margin-maximization loss is then designed to further boost such DIversity via Orthogonality (DIO). In this way, the proposed DIO augments the model and enhances the robustness of the DNN itself, as the learned features can be corrected by these mutually orthogonal paths. Extensive empirical results on various data sets, structures and attacks verify the stronger adversarial robustness of the proposed DIO utilizing model augmentation. Besides, DIO can also be flexibly combined with different data augmentation techniques (e.g., TRADES and DDPM), further promoting robustness gains.

Paper Structure

This paper contains 30 sections, 1 theorem, 16 equations, 8 figures, 7 tables, 1 algorithm.

Key Result

Lemma 1

Given two vectors $\mathbf{u},\mathbf{v}\in\mathcal{R}^d$ independently sampled from $\mathcal{N}(0,1/d)$ and a constant $\varepsilon\in(0,1)$, we have

Figures (8)

  • Figure 1: Comparisons of the architectures and the learned features between a regular network and the proposed DIO.
  • Figure 2: Details of the multiple heads of DIO. The backbone is omitted for simplicity.
  • Figure 3: An illustration of the training and inference of the proposed adversarial defense DIO with a 2-head network structure as an example. During training, all the heads are involved in the forward propagation. The orthogonality constraint $\mathcal{L}_o$ is calculated on the 2 heads, while the distance constraint $\mathcal{L}_d$ is determined based on the 2 heads and the learned features $\mathbf{z}$ from the backbone. The cross-entropy loss is omitted here for simplicity. In inference, only one head is randomly selected to give predictions.
  • Figure 4: Results of ablation studies. These models of PRN18 are adversarially trained on CIFAR10 and CIFAR100 and are evaluated against the PGD-20 and PGD-100 attacks.
  • Figure 5: Results of two adaptive attacks. Each bar in the charts indicates the robust accuracy (%) of each path in DIO ("DIO+AT" of WRN34X10) against the white-box PGD-20 (the salmon bars) and PGD-100 (the cyan bars) attacks. The left chart illustrates the results of Adapt-A1, which averages the losses of all paths to compute the gradients. The right chart illustrates the results of Adapt-A2, which is executed directly on each individual network. The salmon and cyan dashed lines indicate the baseline results of the standard DNN ("AT" of WRN34X10) against PGD-20 and PGD-100, respectively.
  • ...and 3 more figures
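
The inference rule described in the Figure 3 caption (one head is selected at random to give the prediction) can be sketched as follows. This is a minimal illustration, not the authors' code: the heads are assumed to be plain linear maps from the backbone features $\mathbf{z}$ to class logits, and `linear_head` and `predict` are hypothetical names introduced here.

```python
import random


def linear_head(weights):
    """Return a head mapping a feature vector z to class logits.
    `weights` holds one row of coefficients per class (an assumption of this sketch)."""
    def head(z):
        return [sum(w * x for w, x in zip(row, z)) for row in weights]
    return head


def predict(z, heads, rng=random):
    """Pick one head uniformly at random and return its arg-max class index."""
    logits = rng.choice(heads)(z)
    return max(range(len(logits)), key=logits.__getitem__)


# Two toy heads over 2-dimensional features and 2 classes.
heads = [
    linear_head([[1.0, 0.0], [0.0, 1.0]]),
    linear_head([[2.0, 0.5], [0.1, 1.0]]),
]
z = [3.0, 1.0]  # features from the shared backbone
print(predict(z, heads, random.Random(0)))
```

In this toy setup both heads rank class 0 highest for `z`, so the prediction does not depend on which head is drawn. During training, by contrast, the caption states that all heads take part in the forward pass.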

Theorems & Definitions (1)

  • Lemma 1
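
The setting of Lemma 1 can be explored numerically. The inequality itself is not reproduced in this summary, so the sketch below does not test it; it only illustrates a standard fact about the same setup: if the entries of $\mathbf{u}$ and $\mathbf{v}$ are independent draws from $\mathcal{N}(0,1/d)$ (with $1/d$ read as the variance, an assumption about the notation), then $\langle\mathbf{u},\mathbf{v}\rangle$ has mean $0$ and variance $1/d$, so random vectors become nearly orthogonal as $d$ grows. The function names are invented for this example.

```python
import math
import random


def sample_vector(d, rng):
    """Draw d independent entries from N(0, 1/d), i.e. standard deviation 1/sqrt(d)."""
    std = 1.0 / math.sqrt(d)
    return [rng.gauss(0.0, std) for _ in range(d)]


def mean_abs_inner_product(d, trials, seed=0):
    """Average |<u, v>| over `trials` independent pairs of random vectors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        u = sample_vector(d, rng)
        v = sample_vector(d, rng)
        total += abs(sum(a * b for a, b in zip(u, v)))
    return total / trials


# The average magnitude of the inner product shrinks as the dimension grows.
for d in (4, 64, 1024):
    print(d, mean_abs_inner_product(d, trials=200))
```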