Robust Learning with Jacobian Regularization

Judy Hoffman, Daniel A. Roberts, Sho Yaida

TL;DR

This work introduces Jacobian regularization, a technique that minimizes the Frobenius norm of the input-output Jacobian to enlarge decision margins and stabilize neural networks against input perturbations. It provides an efficient random-projection-based approximation enabling practical integration into SGD, with theoretical convergence guarantees. Empirical results on MNIST, CIFAR-10, and ImageNet show that the approach preserves clean accuracy while significantly boosting robustness to random noise and adversarial attacks, and it complements existing defenses. The method is architecture-agnostic and scalable, offering a broadly applicable tool for robust learning in real-world systems.
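The random-projection approximation mentioned above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes the estimator takes the form $C\,\mathbb{E}_{v}\left[\|v^{\top}J\|^2\right]$ with $v$ drawn uniformly from the unit sphere in the $C$-dimensional output space, and it uses an explicit Jacobian matrix so the result can be checked against the exact Frobenius norm. The helper names (`frobenius_sq`, `vjp`, `projected_frobenius_sq`) are hypothetical.

```python
import math
import random


def frobenius_sq(J):
    """Exact squared Frobenius norm of a Jacobian stored as a list of rows."""
    return sum(entry * entry for row in J for entry in row)


def random_unit_vector(dim, rng):
    """Sample uniformly from the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def vjp(v, J):
    """Vector-Jacobian product v^T J.

    An autodiff framework would obtain this from a single backward pass
    without ever building J; the explicit matrix is used here only so the
    sketch stays self-contained.
    """
    n_in = len(J[0])
    return [sum(v[c] * J[c][i] for c in range(len(J))) for i in range(n_in)]


def projected_frobenius_sq(J, n_proj, rng):
    """Estimate ||J||_F^2 from n_proj random projections (assumed estimator)."""
    n_out = len(J)
    total = 0.0
    for _ in range(n_proj):
        v = random_unit_vector(n_out, rng)
        total += sum(g * g for g in vjp(v, J))
    # For v uniform on the unit sphere, E[||v^T J||^2] = ||J||_F^2 / n_out,
    # so rescaling the sample mean by n_out gives an unbiased estimate.
    return n_out * total / n_proj


if __name__ == "__main__":
    J = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # toy 3-output, 2-input Jacobian
    rng = random.Random(0)
    print("exact:    ", frobenius_sq(J))
    print("n_proj=1: ", projected_frobenius_sq(J, 1, rng))
    print("n_proj=3: ", projected_frobenius_sq(J, 3, rng))
```

Each projection needs one vector-Jacobian product rather than one per output class, which is what makes the approximation cheap. A single estimate with small `n_proj` is noisy, but noise of this kind tends to average out over the many minibatch steps of SGD.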

Abstract

Design of reliable systems must guarantee stability against input perturbations. In machine learning, such a guarantee entails preventing overfitting and ensuring robustness of models against corruption of input data. In order to maximize stability, we analyze and develop a computationally efficient implementation of Jacobian regularization that increases classification margins of neural networks. The stabilizing effect of the Jacobian regularizer leads to significant improvements in robustness, as measured against both random and adversarial input perturbations, without severely degrading generalization properties on clean data.
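As a hedged illustration of why a small Jacobian stabilizes predictions (a standard first-order argument, not quoted from the paper), write $f$ for the network output, $J(x)=\partial f/\partial x$ for its input-output Jacobian, and $\epsilon$ for a small input perturbation:

$$
f(x+\epsilon) \approx f(x) + J(x)\,\epsilon,
\qquad
\|J(x)\,\epsilon\|_2 \le \|J(x)\|_F\,\|\epsilon\|_2 .
$$

The change in the output is therefore bounded, to first order, by the Frobenius norm of the Jacobian times the size of the perturbation. A regularized objective of the following form would penalize that bound directly. Here $\mathcal{L}_{\mathrm{task}}$ is an assumed name for the usual supervised loss, and the factor $1/2$ is a convention chosen for this sketch:

$$
\mathcal{L} = \mathcal{L}_{\mathrm{task}} + \frac{\lambda_{\mathrm{JR}}}{2}\,\|J(x)\|_F^2 .
$$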

Paper Structure

This paper contains 18 sections, 24 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: Cross sections of decision cells in the input space. To make these cross sections for LeNet' models trained on the MNIST dataset, a test sample (black dot) and a two-dimensional hyperplane $\subset\mathbb{R}^{784}$ passing through it are randomly chosen. Different colors indicate the different classes predicted by these models, transparency and contours are set by maximum of the softmax values, and the circle around the test sample signifies distance to the closest decision boundary in the plane. (a) Decision cells are rugged without regularization. (b) Training with $L^2$ regularization leads to smoother decision cells, but does not necessarily ensure large cells. (c) Jacobian regularization pushes boundaries outwards and embiggens decision cells.
  • Figure 2: Comparison of Approximate to Exact Jacobian Regularizer. The difference between the exact method (cyan) and the random projection method with $n_{\mathrm{proj}}=1$ (blue) and $n_{\mathrm{proj}}=3$ (red orange) is negligible both in terms of accuracy (a) and the norm of the input-output Jacobian (b) on the test set for LeNet' models trained on MNIST with $\lambda_{\mathrm{JR}}=0.01$. Shading indicates the standard deviation estimated over $5$ distinct runs and dashed vertical lines signify the learning rate quenches.
  • Figure 3: Robustness against random and adversarial input perturbations. This key result illustrates that Jacobian regularization significantly increases the robustness of a learned model with LeNet' architecture trained on the MNIST dataset. (a) Considering robustness under white noise perturbations, Jacobian minimization is the most effective regularizer. (b,c) Jacobian regularization alone outperforms an adversarial training defense (base models all include $L^2$ and dropout regularization). Shades indicate standard deviations estimated over $5$ distinct runs.
  • Figure S1: Cross sections of decision cells in the input space for LeNet' models trained on the MNIST dataset along random hyperplanes. Figure specifications are the same as in Figure 1. (Left) No regularization. (Middle) $L^2$ regularization with $\lambda_{\mathrm{WD}}=0.0005$. (Right) Jacobian regularization with $\lambda_{\mathrm{JR}}=0.01$.
  • Figure S2: Cross sections of decision cells in the input space for LeNet' models trained on the MNIST dataset along adversarial hyperplanes. Namely, given a test sample (black dot), the hyperplane through it is spanned by two adversarial examples identified through FGSM, one for the model trained with $L^2$ regularization $\lambda_{\mathrm{WD}}=0.0005$ and dropout rate $0.5$ but no defense (dark-grey dot; left figure) and the other for the model with the same standard regularization methods plus Jacobian regularization $\lambda_{\mathrm{JR}}=0.01$ and adversarial training (white-grey dot; right figure).
  • ...and 5 more figures