MMA Training: Direct Input Space Margin Maximization through Adversarial Training

Gavin Weiguang Ding; Yash Sharma; Kry Yik Chau Lui; Ruitong Huang

TL;DR

The paper addresses a limitation of adversarial training with a fixed perturbation budget by reframing robustness as maximizing each example's input-space margin $d_\theta(x,y)$, the distance from the input to the classifier's decision boundary. It introduces Max-Margin Adversarial (MMA) training, which maximizes these margins up to a threshold $d_{\max}$ using a cross-entropy surrogate evaluated at an approximate shortest successful perturbation $\delta^*$. The authors derive gradient relationships for margin maximization in smooth and non-smooth settings, propose AN-PGD to approximate $\delta^*$, and add a clean-data loss to stabilize optimization. Experiments on MNIST and CIFAR-10 under $\ell_\infty$ and $\ell_2$ threat models show that MMA improves robustness, is less sensitive to its hyperparameter, and performs competitively with ensembles and TRADES. The work offers both theoretical insight and practical algorithms for margin-based adversarial defenses.
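
The margin and threshold mentioned above can be written out as follows. This is a sketch of one plausible formalization, not a quotation from the paper: the classifier notation $f_\theta^j$ (the logit for class $j$) is an assumption introduced here, and the norm $\|\cdot\|$ stands for whichever of $\ell_\infty$ or $\ell_2$ is being considered.

$$
d_\theta(x, y) = \|\delta^*\| = \min_{\delta} \|\delta\| \quad \text{s.t.} \quad \arg\max_j f_\theta^j(x + \delta) \neq y
$$

Under this reading, maximizing the margin only up to $d_{\max}$ corresponds to a hinge-style penalty that vanishes once an example is far enough from the boundary:

$$
\max\{0,\; d_{\max} - d_\theta(x, y)\}
$$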

Abstract

We study adversarial robustness of neural networks from a margin maximization perspective, where margins are defined as the distances from inputs to a classifier's decision boundary. Our study shows that maximizing margins can be achieved by minimizing the adversarial loss on the decision boundary at the "shortest successful perturbation", demonstrating a close connection between adversarial losses and the margins. We propose Max-Margin Adversarial (MMA) training to directly maximize the margins to achieve adversarial robustness. Instead of adversarial training with a fixed $\epsilon$, MMA offers an improvement by enabling adaptive selection of the "correct" $\epsilon$ as the margin individually for each datapoint. In addition, we rigorously analyze adversarial training with the perspective of margin maximization, and provide an alternative interpretation for adversarial training, maximizing either a lower or an upper bound of the margins. Our experiments empirically confirm our theory and demonstrate MMA training's efficacy on the MNIST and CIFAR10 datasets w.r.t. $\ell_\infty$ and $\ell_2$ robustness. Code and models are available at https://github.com/BorealisAI/mma_training.
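
To make the idea of a margin as the length of the "shortest successful perturbation" concrete, here is a minimal NumPy sketch on a toy linear binary classifier, where the $\ell_2$ margin has a closed form. It is an illustration under stated assumptions, not the paper's AN-PGD procedure: the function names `linear_margin` and `shortest_flip_along`, the toy weights, and the bisection search along a fixed direction are all hypothetical choices made for this example.

```python
import numpy as np


def linear_margin(w, b, x, y):
    """Exact l2 distance from x to the boundary of sign(w.x + b), with y in {-1, +1}.

    Returns 0.0 for a misclassified point, since no perturbation is
    needed to make the prediction wrong.
    """
    score = y * (np.dot(w, x) + b)
    return max(score, 0.0) / np.linalg.norm(w)


def shortest_flip_along(w, b, x, y, direction, hi=10.0, iters=50):
    """Bisection for the smallest step along a unit `direction` that flips the prediction.

    Assumption for this sketch: the prediction is correct for small steps
    and wrong for large ones along `direction`.
    """
    def is_correct(eta):
        return y * (np.dot(w, x + eta * direction) + b) > 0

    if not is_correct(0.0):
        return 0.0  # already misclassified
    if is_correct(hi):
        return hi  # no flip found inside the search range
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if is_correct(mid):
            lo = mid  # still correct: the boundary is farther away
        else:
            hi = mid  # flipped: the boundary is at or before mid
    return hi


# Toy example (hypothetical values).
w = np.array([3.0, 4.0])
b = 0.0
x = np.array([1.0, 1.0])
y = 1

# For a linear model, the steepest direction toward the boundary is -y * w / ||w||.
direction = -y * w / np.linalg.norm(w)

closed_form = linear_margin(w, b, x, y)
searched = shortest_flip_along(w, b, x, y, direction)
print(closed_form, searched)

# Per-example decision in the spirit of a d_max threshold (hypothetical value):
d_max = 1.0
print("margin below d_max:", closed_form < d_max)
```

For a neural network there is no closed form, so the shortest perturbation has to be approximated by an attack; the sketch only shows why the length of that perturbation is the quantity a fixed $\epsilon$ would otherwise have to guess for every datapoint.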
