TART: Boosting Clean Accuracy Through Tangent Direction Guided Adversarial Training
Bongsoo Yi, Rongjie Lai, Yao Li
TL;DR
This work addresses the clean accuracy drop often observed in adversarial training by introducing Tangent Direction Guided Adversarial Training (TART), which uses the data manifold's tangent space to adapt per-sample perturbations. TART estimates tangent spaces via an offline autoencoder–PCA pipeline, computes the tangential component of adversarial perturbations using the projection $\Pi_A = A(A^T A)^{-1}A^T$, and selects an adaptive bound $\epsilon_i$ based on whether the tangential component is above the batch median. The key contributions are (i) tangent-space estimation for each input, (ii) a per-sample perturbation strategy that emphasizes tangential over normal components, and (iii) theoretical justification coupled with empirical validation showing improved clean accuracy with preserved robustness across simulated and CIFAR-10 experiments, and compatibility with standard AT, TRADES, MART, and GAIRAT. The results suggest that incorporating geometric properties of data can enhance adversarial training efficiency and effectiveness in practical settings.
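The projection and the median rule described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array layout, and the two bound values `eps_high` and `eps_low` are assumptions made for illustration. The summary only states that $\epsilon_i$ depends on whether the tangential component is above the batch median; assigning the larger bound to the above-median samples is an assumed reading of "emphasizes tangential over normal components".

```python
import numpy as np

def tangential_norms(deltas, bases):
    """Norm of the tangential component of each perturbation.

    deltas: (n, d) array of adversarial perturbations (hypothetical layout).
    bases:  (n, d, k) array; bases[i] is a matrix A whose columns span the
            estimated tangent space at sample i (assumed full column rank).
    """
    norms = np.empty(len(deltas))
    for i, (delta, A) in enumerate(zip(deltas, bases)):
        # Pi_A delta = A (A^T A)^{-1} A^T delta. Solving the least-squares
        # problem A c ~= delta gives c = (A^T A)^{-1} A^T delta without
        # forming the inverse explicitly.
        coeffs, *_ = np.linalg.lstsq(A, delta, rcond=None)
        norms[i] = np.linalg.norm(A @ coeffs)
    return norms

def adaptive_bounds(norms, eps_high, eps_low):
    """Per-sample bound epsilon_i from the batch median.

    Assumption: samples whose tangential norm is strictly above the batch
    median receive eps_high, the rest receive eps_low. The actual values
    used by TART are not given in this summary.
    """
    return np.where(norms > np.median(norms), eps_high, eps_low)
```

A typical use would compute `tangential_norms` on the perturbations of a mini-batch and pass the result to `adaptive_bounds` before generating the final adversarial examples. When the columns of $A$ are orthonormal, as with a PCA basis, the projection reduces to $A A^T$.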
Abstract
Adversarial training has been shown to be successful in enhancing the robustness of deep neural networks against adversarial attacks. However, this robustness is accompanied by a significant decline in accuracy on clean data. In this paper, we propose a novel method, called Tangent Direction Guided Adversarial Training (TART), that leverages the tangent space of the data manifold to improve existing adversarial defense algorithms. We argue that training with adversarial examples that have large normal components significantly alters the decision boundary and hurts accuracy. TART mitigates this issue by estimating the tangent direction of adversarial examples and allocating an adaptive perturbation limit according to the norm of their tangential component. To the best of our knowledge, our paper is the first work to consider the concept of tangent space and direction in the context of adversarial defense. We validate the effectiveness of TART through extensive experiments on both simulated and benchmark datasets. The results demonstrate that TART consistently boosts clean accuracy while retaining a high level of robustness against adversarial attacks. Our findings suggest that incorporating the geometric properties of data can lead to more effective and efficient adversarial training methods.
