Adversarial Vulnerability as a Consequence of On-Manifold Inseparibility

Rajdeep Haldar, Yue Xing, Qifan Song, Guang Lin

TL;DR

This work considers classification tasks and characterizes the data distribution as a low-dimensional manifold, with high- and low-variance features defining the on- and off-manifold directions, respectively. It argues that clean training converges poorly in the off-manifold direction because of the ill-conditioning that affects widely used first-order optimizers such as gradient descent.

Abstract

Recent works have shown theoretically and empirically that redundant data dimensions are a source of adversarial vulnerability. However, the inverse doesn't seem to hold in practice; employing dimension-reduction techniques doesn't exhibit robustness as expected. In this work, we consider classification tasks and characterize the data distribution as a low-dimensional manifold, with high/low variance features defining the on/off manifold direction. We argue that clean training experiences poor convergence in the off-manifold direction caused by the ill-conditioning in widely used first-order optimizers like gradient descent. The poor convergence then acts as a source of adversarial vulnerability when the dataset is inseparable in the on-manifold direction. We provide theoretical results for logistic regression and a 2-layer linear network on the considered data distribution. Furthermore, we advocate using second-order methods that are immune to ill-conditioning and lead to better robustness. We perform experiments and exhibit tremendous robustness improvements in clean training through long training and the employment of second-order methods, corroborating our framework. Additionally, we find the inclusion of batch-norm layers hinders such robustness gains. We attribute this to differing implicit biases between traditional and batch-normalized neural networks.
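
The ill-conditioning argument can be illustrated with a minimal sketch. It is not the paper's experimental setup: it assumes a two-parameter quadratic loss whose Hessian is diagonal, with a hypothetical high-variance (on-manifold) curvature of 1 and a low-variance (off-manifold) curvature of 1e-4. The names `gradient_descent`, `newton_step`, `variances`, and `w_star` are illustrative only.

```python
import numpy as np


def gradient_descent(hessian, w_star, w0, lr, steps):
    """Plain gradient descent on 0.5 * (w - w*)^T H (w - w*)."""
    w = w0.copy()
    for _ in range(steps):
        grad = hessian @ (w - w_star)  # gradient of the quadratic loss
        w = w - lr * grad
    return w


def newton_step(hessian, w_star, w0):
    """One Newton step: rescale the gradient by the inverse Hessian."""
    grad = hessian @ (w0 - w_star)
    return w0 - np.linalg.solve(hessian, grad)


# Assumed curvatures: index 0 is the on-manifold (high-variance) direction,
# index 1 is the off-manifold (low-variance) direction.
variances = np.array([1.0, 1e-4])
hessian = np.diag(variances)
w_star = np.array([1.0, 1.0])  # hypothetical optimal parameter
w0 = np.zeros(2)

w_gd = gradient_descent(hessian, w_star, w0, lr=0.5, steps=1000)
w_newton = newton_step(hessian, w_star, w0)

gd_error = np.abs(w_gd - w_star)
newton_error = np.abs(w_newton - w_star)

print("GD error     (on, off):", gd_error)
print("Newton error (on, off):", newton_error)
```

In this toy problem, gradient descent drives the on-manifold error to essentially zero within 1000 steps, while more than 90% of the initial off-manifold error remains. The single Newton step removes the error in both directions, because the inverse Hessian cancels the difference in curvature.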

Paper Structure

This paper contains 42 sections, 8 theorems, 62 equations, 6 figures, 4 tables.

Key Result

Theorem 1.1

Convergence to the optimal parameter is faster and independent of dimensionality in the on-manifold direction compared to the off-manifold direction. Furthermore, as dimensionality reduces, the convergence rate for the off-manifold direction worsens.
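
One way to see why convergence rates can differ by direction is a quadratic surrogate. This is a textbook calculation under assumed notation, not the paper's formal statement, and it does not cover the dependence on dimensionality. Assume the loss near the optimum $w^*$ has Hessian eigenvalues $\lambda_{\text{on}} \gg \lambda_{\text{off}} > 0$ along the on- and off-manifold directions, and that gradient descent uses a step size $\eta < 1/\lambda_{\text{on}}$.

```latex
\begin{align*}
w_{t+1} - w^* &= (I - \eta H)(w_t - w^*), \qquad H = \operatorname{diag}(\lambda_{\text{on}}, \lambda_{\text{off}}), \\
|w_{t,i} - w^*_i| &= (1 - \eta\lambda_i)^t \, |w_{0,i} - w^*_i|, \qquad i \in \{\text{on}, \text{off}\}, \\
t_i(\varepsilon) &= \frac{\ln(1/\varepsilon)}{-\ln(1 - \eta\lambda_i)} \approx \frac{\ln(1/\varepsilon)}{\eta\lambda_i} \quad \text{for small } \eta\lambda_i .
\end{align*}
```

Here $t_i(\varepsilon)$ is the number of steps needed to shrink the error in direction $i$ by a factor $\varepsilon$. Because the step size is capped by the on-manifold curvature, the off-manifold direction needs on the order of $\lambda_{\text{on}}/\lambda_{\text{off}}$ times as many steps, which is the condition number of $H$.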

Figures (6)

  • Figure 1: Binary classification between birds and insects. The purple region represents overlap in the on-manifold feature, where only the off-manifold feature can distinguish between the two classes.
  • Figure 2: Optimal classifier robustly separates the ambient space into red and blue regions. The estimated decision boundary (green) is suboptimal and vulnerable even though it separates the data manifold accurately.
  • Figure 3: Neural network estimated decision boundary after $T$ training epochs. Training and testing accuracy are 100% for all $T\geq 10^2$. The data distribution is on-manifold (a) inseparable, (b) separable.
  • Figure 4: PGD $\ell_\infty$ robustness for clean MNIST (top) and FMNIST (bottom) models. Optimization schemes: (left) first-order; (right) second-order.
  • Figure 5: PGD $\ell_\infty$ robustness of the clean CIFAR10 model.
  • ...and 1 more figure

Theorems & Definitions (17)

  • Example 1
  • Example 2
  • Theorem 1.1: Informal version of Theorem 4.2 (Parameter Convergence)
  • Theorem 1.2: Informal version of Theorem 4.3 (Loss Convergence)
  • Theorem 4.1: Progressive bounds
  • Theorem 4.2: Parameter Convergence
  • Theorem 4.3: Loss Convergence
  • Lemma A.1.1: Reparametrization
  • proof
  • Lemma A.2.1: Lipschitz smoothness and Strong Convexity
  • ...and 7 more