Ensemble Adversarial Defense via Integration of Multiple Dispersed Low Curvature Models

Kaikang Zhao; Xi Chen; Wei Huang; Liuxin Ding; Xianglong Kong; Fan Zhang

TL;DR

This work identifies second-order gradients, which characterize the loss curvature, as a key factor in adversarial robustness, and introduces a novel regularizer for training an ensemble of more diverse, low-curvature network models.

Abstract

Ensembles of deep learning models have been extensively explored as a defense against adversarial attacks. Diversity among sub-models increases the attack cost required to deceive the majority of the ensemble, thereby improving adversarial robustness. Existing approaches mainly focus on increasing diversity in feature representations or the dispersion of first-order gradients with respect to the input, but the limited correlation between these diversity metrics and adversarial robustness constrains the performance of ensemble adversarial defense. In this work, we aim to enhance ensemble diversity by reducing attack transferability. We identify second-order gradients, which characterize the loss curvature, as a key factor in adversarial robustness. Because computing the Hessian matrix involved in second-order gradients is computationally expensive, we approximate the Hessian-vector product using differential approximation. Given that low curvature provides better robustness, our ensemble model is designed to account for the influence of curvature among different sub-models. We introduce a novel regularizer to train multiple more diverse, low-curvature network models. Extensive experiments across various datasets demonstrate that our ensemble model exhibits superior robustness against a range of attacks, underscoring the effectiveness of our approach.
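
The Hessian-vector product approximation mentioned in the abstract can be illustrated with a small sketch. It assumes a forward finite difference of gradients, $Hv \approx (\nabla\mathcal{L}(x + hv) - \nabla\mathcal{L}(x)) / h$, which is one common reading of "differential approximation"; the exact scheme and step size used in the paper are not given here. The function name `hvp_finite_difference`, the toy quadratic loss, and the step `h` are all hypothetical choices for illustration.

```python
import numpy as np

def hvp_finite_difference(grad_fn, x, v, h=1e-3):
    """Approximate H(x) @ v by a forward difference of gradients.

    Assumption: this forward-difference form is an illustrative stand-in,
    not necessarily the paper's exact approximation.
    """
    return (grad_fn(x + h * v) - grad_fn(x)) / h

# Toy quadratic loss L(x) = 0.5 * x^T A x + b^T x (hypothetical, for checking only).
# Its Hessian is exactly A, so the approximation can be compared with A @ v.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([0.5, -1.0])

def grad_fn(x):
    # Gradient of the toy quadratic loss with respect to the input x.
    return A @ x + b

x = np.array([0.2, -0.4])   # input point
v = np.array([1.0, 0.0])    # direction along which curvature is probed

hv_approx = hvp_finite_difference(grad_fn, x, v)
hv_exact = A @ v

# Directional curvature v^T H v, the kind of quantity a curvature penalty could use.
curvature_along_v = float(v @ hv_approx)
```

The approach needs only two gradient evaluations per direction and never forms the full Hessian, which is what makes a curvature term affordable during training. For a real network, `grad_fn` would be the gradient of the loss with respect to the input, obtained from an automatic differentiation library.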

Paper Structure (12 sections, 1 theorem, 12 equations, 7 figures, 4 tables)

Introduction
Related Work
Methodology
Training Strategies for Ensemble Model
EDLCM
Analysis of Ensemble Adversarial Defense
Experiments
Experimental Setup
Results under White-box Attacks
Results under Black-box Attacks
Ablation Tests
Conclusion

Key Result

Theorem 1

(moosavi2019robustness) Let the threshold $t$ denote the boundary value of the loss function, and let $x$ be such that $c = t - \mathcal{L}(x) \geq 0$. If $\nu = \lambda_{\max}(H) \geq 0$ and $u$ is the eigenvector corresponding to $\nu$, then

Figures (7)

Figure 1: Adversarial Subspace
Figure 2: Heterogeneous Model Diversification
Figure 3: Adversarial examples generated on TinyImageNet under BIM
Figure 4: Comparative Analysis of Defense Performance for Various Ensemble Defense Strategies Against Type 1 Black-Box Attacks on CIFAR-100
Figure 5: Comparative Analysis of Defense Performance for Various Ensemble Defense Strategies Against Type 2 Black-Box Attacks on CIFAR-100
...and 2 more figures

Theorems & Definitions (1)

Theorem 1
