Robust Learning with Jacobian Regularization
Judy Hoffman, Daniel A. Roberts, Sho Yaida
TL;DR
This work introduces Jacobian regularization, a technique that penalizes the Frobenius norm of a network's input-output Jacobian in order to enlarge decision margins and stabilize predictions against input perturbations. It provides an efficient random-projection-based approximation of the regularizer that can be integrated into standard SGD training, with theoretical convergence guarantees. Empirical results on MNIST, CIFAR-10, and ImageNet show that the approach largely preserves clean accuracy while improving robustness to both random noise and adversarial attacks, and that it complements existing defenses. The method is architecture-agnostic and scalable, which makes it applicable to a wide range of models.
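As a sketch of the quantity involved, the regularized objective and the random-projection identity can be written as follows. The notation is assumed here for illustration and is not quoted from the paper: f maps an input x in R^I to C outputs, J(x) is the C-by-I input-output Jacobian, lambda is a hypothetical regularization strength (the factor 1/2 is a convention chosen for this sketch), and v-hat is a random vector drawn uniformly from the unit sphere in output space.

```latex
\begin{aligned}
\mathcal{L}_{\text{joint}}(\theta)
  &= \mathcal{L}_{\text{task}}(\theta)
   + \frac{\lambda}{2}\,\lVert J(x) \rVert_F^2,
  \qquad
  J_{c,i}(x) = \frac{\partial f_c}{\partial x_i}(x), \\
\lVert J(x) \rVert_F^2
  &= C \,\mathbb{E}_{\hat v \sim S^{C-1}}
     \left[ \lVert \hat v^{\top} J(x) \rVert_2^2 \right].
\end{aligned}
```

The second line follows from E[v-hat v-hat^T] = I/C for a uniformly distributed unit vector. Each sample of the expectation needs only one vector-Jacobian product, that is, one backward pass, instead of the C passes required to form the full Jacobian.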
Abstract
The design of reliable systems must guarantee stability against input perturbations. In machine learning, such a guarantee entails preventing overfitting and ensuring robustness of models against corruption of input data. To maximize stability, we analyze and develop a computationally efficient implementation of Jacobian regularization that increases the classification margins of neural networks. The stabilizing effect of the Jacobian regularizer leads to significant improvements in robustness, as measured against both random and adversarial input perturbations, without severely degrading generalization properties on clean data.
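To make the idea of an efficient Jacobian penalty concrete, here is a minimal sketch in pure Python of one way to estimate the squared Frobenius norm of an input-output Jacobian from random projections. It is not the authors' implementation. The toy network f(x) = W2 tanh(W1 x), the helper names, the layer sizes, and the sample count are all assumptions made for illustration, and the vector-Jacobian product is written by hand where a real framework would use automatic differentiation.

```python
import math
import random


def forward(W1, W2, x):
    """Toy network f(x) = W2 tanh(W1 x) (hypothetical, for illustration)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hj for w, hj in zip(row, h)) for row in W2]


def vjp(W1, W2, x, v):
    """Return v^T J, where J = df/dx = W2 diag(1 - tanh^2(W1 x)) W1."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    # Backpropagate v through the output layer W2.
    g = [sum(v[c] * W2[c][j] for c in range(len(W2))) for j in range(len(h))]
    # Backpropagate through the tanh nonlinearity.
    g = [gj * (1.0 - hj * hj) for gj, hj in zip(g, h)]
    # Backpropagate through the input layer W1.
    return [sum(g[j] * W1[j][i] for j in range(len(W1))) for i in range(len(x))]


def random_unit_vector(dim, rng):
    """Sample uniformly from the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]


def exact_frobenius_sq(W1, W2, x):
    """Exact ||J||_F^2: one backward pass per output unit (C passes in total)."""
    C = len(W2)
    total = 0.0
    for c in range(C):
        basis = [1.0 if k == c else 0.0 for k in range(C)]
        total += sum(r * r for r in vjp(W1, W2, x, basis))
    return total


def estimate_frobenius_sq(W1, W2, x, n_proj, rng):
    """Monte Carlo estimate: C times the mean of ||v^T J||^2 over random unit v."""
    C = len(W2)
    acc = 0.0
    for _ in range(n_proj):
        v = random_unit_vector(C, rng)
        acc += sum(r * r for r in vjp(W1, W2, x, v))
    return C * acc / n_proj


rng = random.Random(0)
n_in, n_hidden, n_out = 4, 5, 3  # arbitrary sizes for the toy example
W1 = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
W2 = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_out)]
x = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]

exact = exact_frobenius_sq(W1, W2, x)
estimate = estimate_frobenius_sq(W1, W2, x, n_proj=5000, rng=rng)
print("exact:", exact, "estimate:", estimate)
```

The estimator is unbiased for any number of projections; the large `n_proj` here only serves to make the agreement with the exact value visible. In training, the estimate would be added to the task loss with some weight and differentiated along with it, which this sketch does not attempt.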
