Revisiting Adversarial Training at Scale

Zeyu Wang; Xianhang Li; Hongru Zhu; Cihang Xie

TL;DR

Revisiting Adversarial Training at Scale shows that adversarial training can be scaled to foundation-model regimes by coupling a two-stage, coarse-to-fine training schedule with a weak-to-strong attack progression and CLIP-enabled learning on web-scale data. The resulting framework, AdvXL, sets new state-of-the-art robust accuracy on ImageNet-1K under AutoAttack, with significant gains in $l_{\infty}$, $l_{2}$, and $l_{1}$ robustness when a ViT-g backbone is trained on DataComp-1B. Key contributions include (i) a practical two-stage training protocol that cuts compute while preserving robustness, (ii) token-reduction strategies validated for efficiency, (iii) integration of CLIP embeddings to leverage web data, and (iv) comprehensive scaling studies across model size, data scale, and training schedules. The work suggests that adversarial training at scale can approach the robustness levels expected from foundation models, with practical benefits for both training efficiency and robustness to unseen attacks.
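To make the "weak-to-strong attack progression" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a projected-gradient ($l_{\infty}$) attack on a toy logistic-regression model, where "weak" and "strong" differ only in the number of attack steps. The function names, the `STAGES` list, and every numeric value are hypothetical and chosen only for illustration; none of them are taken from the paper.

```python
import math

def sign(v):
    """Return -1, 0, or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def logistic_loss(w, b, x, y):
    """Logistic loss log(1 + exp(-y * (w.x + b))) for a label y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log(1.0 + math.exp(-margin))

def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))  # dLoss/dmargin times y
    return [coeff * wi for wi in w]

def pgd_linf(w, b, x, y, eps, step_size, num_steps):
    """Signed-gradient ascent on the loss, projected onto the l_inf ball of radius eps."""
    x_adv = list(x)
    for _ in range(num_steps):
        grad = loss_grad_wrt_input(w, b, x_adv, y)
        # Ascent step along the sign of the input gradient.
        x_adv = [xa + step_size * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: keep every coordinate within eps of the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Hypothetical two-stage schedule: a cheap, weak attack first, a stronger one later.
STAGES = [
    {"name": "stage 1 (weak attack)", "num_steps": 1},
    {"name": "stage 2 (strong attack)", "num_steps": 3},
]

w, b = [1.0, -2.0], 0.0   # toy model parameters (illustrative)
x, y = [0.5, 0.5], 1      # one toy example with label +1
for stage in STAGES:
    x_adv = pgd_linf(w, b, x, y, eps=0.25, step_size=0.1, num_steps=stage["num_steps"])
    print(stage["name"], "adversarial loss:", logistic_loss(w, b, x_adv, y))
```

In a real training loop, the adversarial example produced at each stage would be fed back into the model update. The sketch only shows how the attack strength could be varied across stages while the perturbation budget `eps` stays fixed.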

Abstract

The machine learning community has witnessed a drastic change in the training pipeline, pivoted by "foundation models" of unprecedented scale. However, the field of adversarial training is lagging behind, predominantly centered around small model sizes like ResNet-50 and tiny, low-resolution datasets like CIFAR-10. To bridge this transformation gap, this paper provides a modern re-examination of adversarial training, investigating its potential benefits when applied at scale. Additionally, we introduce an efficient and effective training strategy to enable adversarial training with giant models and web-scale data at an affordable computing cost. We denote this newly introduced framework as AdvXL. Empirical results demonstrate that AdvXL establishes new state-of-the-art robust accuracy records under AutoAttack on ImageNet-1K. For example, by training on the DataComp-1B dataset, AdvXL empowers a vanilla ViT-g model to substantially surpass the previous records of $l_{\infty}$-, $l_{2}$-, and $l_{1}$-robust accuracy by margins of 11.4%, 14.2%, and 12.9%, respectively. This achievement positions AdvXL as a pioneering approach, charting a new trajectory for the efficient training of robust visual representations at significantly larger scales. Our code is available at https://github.com/UCSC-VLAA/AdvXL.

Paper Structure (28 sections, 3 equations, 3 figures, 5 tables)

This paper contains 28 sections, 3 equations, 3 figures, 5 tables.

Introduction
Related Work
Adversarial Training
Scaling Vision Foundation Models
AdvXL
Adversarial Training
Two-stage Training
Coarse-to-fine training.
Weak-to-strong training.
Fine-tuning.
CLIP Embedding for Web-Crawled Images
Experiment
Implementation
Dataset.
Training.
...and 13 more sections

Figures (3)

Figure 1: AdvXL scales up significantly in both model size and data scale, which brings a substantial boost over the prior best results for $l_{\infty}$, $l_{2}$, and $l_{1}$ robustness on ImageNet-1K, even though our model is trained only to be $l_{\infty}$-robust.
Figure 2: Illustration of different approaches to image token reduction.
Figure 3: Illustration of leveraging CLIP embedding in adversarial training. The gray line denotes the adversarial example generation flow.
