Data-Dependent Stability Analysis of Adversarial Training
Yihan Wang, Shuang Liu, Xiao-Shan Gao
TL;DR
The paper introduces a data-dependent stability analysis for SGD-based adversarial training, deriving generalization bounds that incorporate the data distribution via on-average stability. For convex adversarial losses, the bound depends on the adversarial population risk at initialization and on the gradient variance; for non-convex losses, an additional curvature term involving the Hessian appears under approximate Hessian-Lipschitz assumptions. The bounds scale with the adversarial budget $\epsilon$ and remain consistent with standard training when $\epsilon = 0$, while explicitly accounting for initialization and for distribution shifts caused by data poisoning. Empirical results on multiple datasets support the theory, showing how robust generalization degrades with larger budgets and how poisoning attacks affect robustness in a manner consistent with the proposed bounds.
Abstract
Stability analysis is a central tool for studying the generalization ability of deep learning, as it yields generalization bounds for training algorithms based on stochastic gradient descent. Adversarial training is the most widely used defense against adversarial example attacks. However, previous generalization bounds for adversarial training have not included information about the data distribution. In this paper, we fill this gap by providing generalization bounds for stochastic gradient descent-based adversarial training that incorporate data distribution information. We use the concepts of on-average stability and high-order approximate Lipschitz conditions to examine how changes in the data distribution and the adversarial budget affect robust generalization gaps. Our derived generalization bounds for both convex and non-convex losses are at least as good as their uniform stability-based counterparts, which do not include data distribution information. Furthermore, our findings show how distribution shifts from data poisoning attacks can affect robust generalization.
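As a rough illustration of the setting described above, the following Python sketch runs SGD-based adversarial training on a linear logistic model, for which the convex adversarial loss under an $\ell_\infty$ budget $\epsilon$ has a closed form, and then estimates the robust generalization gap. It is not the authors' experimental setup: the synthetic data, the linear model, the function names (`adv_loss`, `adv_grad`, `sgd_adv_train`, `robust_gap`), and all hyperparameters are assumptions chosen purely for illustration.

```python
import numpy as np

def adv_loss(w, X, y, eps):
    """Per-example adversarial logistic loss for a linear model, labels in {-1, +1}.

    For an l_inf budget eps, the worst-case perturbation of x is -eps * y * sign(w),
    so the worst-case margin is y * <w, x> - eps * ||w||_1.
    """
    margin = y * (X @ w) - eps * np.abs(w).sum()
    return np.logaddexp(0.0, -margin)  # log(1 + exp(-margin)), numerically stable

def adv_grad(w, x, y, eps):
    """(Sub)gradient of the adversarial loss at a single example (x, y)."""
    margin = y * (x @ w) - eps * np.abs(w).sum()
    s = np.exp(-np.logaddexp(0.0, margin))  # sigmoid(-margin)
    return -s * (y * x - eps * np.sign(w))

def sgd_adv_train(X, y, eps, steps=2000, lr=0.05, seed=0):
    """Plain SGD on the adversarial loss, starting from w = 0 (illustrative choice)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        i = rng.integers(len(y))  # sample one training example uniformly
        w -= lr * adv_grad(w, X[i], y[i], eps)
    return w

def robust_gap(eps, n_train=50, n_test=5000, d=20, seed=0):
    """Estimate: adversarial loss on held-out data minus adversarial loss on training data."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)  # hypothetical ground-truth direction

    def sample(n):
        X = rng.normal(size=(n, d))
        y = np.where(X @ w_true + 0.5 * rng.normal(size=n) >= 0, 1.0, -1.0)
        return X, y

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)
    w = sgd_adv_train(X_tr, y_tr, eps, seed=seed)
    return adv_loss(w, X_te, y_te, eps).mean() - adv_loss(w, X_tr, y_tr, eps).mean()

for eps in (0.0, 0.1, 0.3):
    print(f"eps={eps:.1f}  robust gap estimate={robust_gap(eps):.4f}")
```

Setting `eps=0` reduces `adv_loss` to the standard logistic loss, mirroring the consistency with standard training noted above. The printed gaps are single-seed estimates on synthetic data and should not be read as evidence for or against the paper's bounds.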
