Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin

TL;DR

The paper investigates using mode connectivity to analyze and improve adversarial robustness by learning high-accuracy paths between models with limited clean data. It demonstrates that such paths can repair backdoored and error-injected networks while preserving clean accuracy, outperforming standard baselines. It also reveals a robustness loss barrier on paths between regular and adversarially-trained models and links robustness to the largest input-Hessian eigenvalue, offering theoretical justification. Across CIFAR-10 and SVHN with VGG/ResNet, the approach provides practical insights for robustness evaluation and model repair, including evasion-attack implications and potential ensembling limitations.

Abstract

Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. When network models are tampered with by backdoor or error-injection attacks, our results demonstrate that the path connection learned using a limited amount of bona fide data can effectively mitigate adversarial effects while maintaining the original accuracy on clean data. Therefore, mode connectivity provides users with the power to repair backdoored or error-injected models. We also use mode connectivity to investigate the loss landscapes of regular and robust models against evasion attacks. Experiments show that there exists a barrier in adversarial robustness loss on the path connecting regular and adversarially-trained models. A high correlation is observed between the adversarial robustness loss and the largest eigenvalue of the input Hessian matrix, for which theoretical justifications are provided. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
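
To make the idea of a "path connection" between two trained models concrete, here is a minimal sketch. It assumes a quadratic Bezier parametrization of the path, which is a common choice in the mode-connectivity literature but is not specified in the abstract above. The names `bezier_point`, `loss_along_path`, and `ring_loss`, the toy weight vectors, and the bend point `theta` are all hypothetical and chosen for illustration only; in practice `theta` would be learned from data rather than set by hand.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def bezier_point(w1: Sequence[float], theta: Sequence[float],
                 w2: Sequence[float], t: float) -> Vector:
    """Point on a quadratic Bezier curve that starts at w1 (t=0) and ends at w2 (t=1).

    theta is the bend point; this parametrization is an assumption of the sketch.
    """
    a, b, c = (1 - t) ** 2, 2 * t * (1 - t), t ** 2
    return [a * x + b * m + c * y for x, m, y in zip(w1, theta, w2)]


def loss_along_path(loss: Callable[[Vector], float], w1: Sequence[float],
                    theta: Sequence[float], w2: Sequence[float],
                    num_points: int = 5) -> List[Tuple[float, float]]:
    """Evaluate the loss at evenly spaced t in [0, 1] along the curve."""
    ts = [i / (num_points - 1) for i in range(num_points)]
    return [(t, loss(bezier_point(w1, theta, w2, t))) for t in ts]


def ring_loss(w: Vector) -> float:
    """Toy loss (hypothetical): zero everywhere on the unit circle."""
    return (sum(x * x for x in w) - 1.0) ** 2


# Two toy "models", both at zero loss.
w1 = [1.0, 0.0]
w2 = [0.0, 1.0]

# Bend point at the midpoint of w1 and w2 reproduces the straight segment.
straight = loss_along_path(ring_loss, w1, [0.5, 0.5], w2, num_points=3)
# A hand-picked bend point that keeps the path closer to the low-loss ring.
curved = loss_along_path(ring_loss, w1, [1.0, 1.0], w2, num_points=3)

print("straight midpoint loss:", straight[1][1])  # 0.25
print("curved midpoint loss:  ", curved[1][1])    # 0.015625
```

In this toy setting the straight segment crosses a loss barrier at its midpoint, while the curved path stays much closer to the low-loss region. Plotting loss or error rate against `t`, as the paper's figures do, is the one-dimensional view of the same idea.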

Paper Structure

This paper contains 38 sections, 1 theorem, 16 equations, 23 figures, 8 tables.

Key Result

Lemma 1

Given assumption (a), for any vector norm $\|\cdot\|$, for any data sample $x$,

Figures (23)

  • Figure 1: Loss and error rate on the path connecting two untampered VGG models trained on CIFAR-10. The path connection is trained using different settings as indicated by the curve colors. The results using SVHN and ResNet are given in Appendix \ref{appen_untampered}. The inference results on the test set are evaluated using 5000 samples, which are separate from those used for path connection.
  • Figure 2: Error rate against backdoor attacks on the connection path for CIFAR-10 (VGG). The error rate of clean/backdoored samples means the standard-test-error/attack-failure-rate, respectively.
  • Figure 3: Error rate against error-injection attack on the connection path for CIFAR-10 (VGG). The error rate of clean/targeted samples means standard-test-error/attack-failure-rate, respectively.
  • Figure 4: Loss, error rate, attack success rate and largest eigenvalue of input Hessian on the path connecting different model pairs on CIFAR-10 (VGG) using standard loss. The error rate of training/test data means standard training/test error, respectively. In all cases, there is no standard loss barrier but a robustness loss barrier. There is also a high correlation between the robustness loss and the largest eigenvalue of input Hessian, and their Pearson correlation coefficient (PCC) is reported in the title.
  • Figure A1: Loss and error rate on the path connecting two untampered ResNet models trained on SVHN. The path connection is trained using different settings as indicated by the curve colors. The inference results on the test set are evaluated using 5000 samples, which are separate from those used for path connection.
  • ...and 18 more figures
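
The Figure 4 caption refers to two quantities that are easy to state but less obvious to compute: the largest eigenvalue of a Hessian matrix and the Pearson correlation coefficient (PCC) between two curves sampled along the path. The sketch below shows one way each could be computed on small toy inputs. It uses plain power iteration on an explicit symmetric matrix, which is an assumption of this sketch and not necessarily the procedure used in the paper; for a real network the input Hessian would come from automatic differentiation, which is not shown. The function names `pearson` and `largest_eigenvalue` and the toy matrix are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def matvec(matrix: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def largest_eigenvalue(matrix: Sequence[Sequence[float]], iters: int = 200) -> float:
    """Estimate the dominant eigenvalue of a symmetric matrix by power iteration.

    Power iteration converges to the eigenvalue of largest magnitude, which
    equals the largest eigenvalue when the matrix is positive semidefinite.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit-norm) converged vector.
    return sum(a * b for a, b in zip(matvec(matrix, v), v))


# Toy symmetric matrix standing in for a Hessian (hypothetical values).
toy_hessian = [[4.0, 1.0], [1.0, 3.0]]
print("largest eigenvalue:", largest_eigenvalue(toy_hessian))

# Two perfectly linearly related toy curves give a PCC of 1.
print("PCC:", pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

Applied along a path, one would collect the robustness loss and the eigenvalue estimate at each sampled `t` and pass the two resulting sequences to `pearson`.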

Theorems & Definitions (2)

  • Lemma 1
  • proof