Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

Anna Bair; Hongxu Yin; Maying Shen; Pavlo Molchanov; Jose Alvarez

TL;DR

AdaSAP tackles the challenge of obtaining compact neural networks that remain robust to input variations unseen during training. It is a three-step framework: adaptive weight perturbations first push pruning candidates into flat regions of the loss landscape, structured channel pruning is then performed, and a final stage enforces uniform flatness across the network to encourage robustness. By tying pruning and robustness to the same sharpness objective, AdaSAP improves robust accuracy on ImageNet-C and ImageNet-V2, and improves mAP on corrupted data for object detection, across multiple architectures and pruning regimes. The method works with any pruning method and shows that sharpness-based optimization can bridge efficiency and generalization, which matters for deploying robust sparse models in real-world settings. The authors also provide ablations and an analysis of sharpness, explaining why flatter minima make a network both easier to prune and more robust.

Abstract

Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world. The goals of robustness and compactness may seem to be at odds, since robustness requires generalization across domains, while the process of compression exploits specificity in one domain. We introduce Adaptive Sharpness-Aware Pruning (AdaSAP), which unifies these goals through the lens of network sharpness. The AdaSAP method produces sparse networks that are robust to input variations which are unseen at training time. We achieve this by strategically incorporating weight perturbations in order to optimize the loss landscape. This allows the model to be both primed for pruning and regularized for improved robustness. AdaSAP improves the robust accuracy of pruned models on image classification by up to +6% on ImageNet C and +4% on ImageNet V2, and on object detection by +4% on a corrupted Pascal VOC dataset, over a wide range of compression ratios, pruning criteria, and network architectures, outperforming recent pruning art by large margins.
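To make the idea of adaptive weight perturbation concrete, here is a minimal NumPy sketch of one sharpness-aware update in which each neuron (a row of a weight matrix) gets its own perturbation radius. It is an illustration under stated assumptions, not the authors' implementation: the toy least-squares loss, the use of the row L2 norm as an importance proxy, the linear mapping from importance to radius, and the names `loss_and_grad`, `adaptive_rho`, `adaptive_sam_step`, `rho_min`, and `rho_max` are all hypothetical. The only property taken from the text is that neurons likely to be pruned should be pushed toward flatter regions, which the sketch approximates by giving less important neurons a larger perturbation.

```python
import numpy as np


def loss_and_grad(W, X, Y):
    """Toy least-squares loss 0.5 * mean ||X W^T - Y||^2 and its gradient w.r.t. W."""
    residual = X @ W.T - Y                      # shape (n_samples, n_neurons)
    loss = 0.5 * np.sum(residual ** 2) / X.shape[0]
    grad = residual.T @ X / X.shape[0]          # shape (n_neurons, n_inputs)
    return loss, grad


def adaptive_rho(W, rho_min=0.01, rho_max=0.1):
    """Per-neuron perturbation radius (assumption: linear in a norm-based score).

    The least important neuron (smallest row norm) gets rho_max,
    the most important neuron gets rho_min.
    """
    scores = np.linalg.norm(W, axis=1)
    span = scores.max() - scores.min()
    normalized = np.zeros_like(scores) if span == 0 else (scores - scores.min()) / span
    return rho_max - normalized * (rho_max - rho_min)


def adaptive_sam_step(W, X, Y, lr=0.1, rho_min=0.01, rho_max=0.1):
    """One sharpness-aware step with a separate perturbation radius per neuron."""
    _, grad = loss_and_grad(W, X, Y)
    rho = adaptive_rho(W, rho_min, rho_max)
    # Move each row along its own gradient direction, scaled to length rho_i.
    row_norms = np.linalg.norm(grad, axis=1, keepdims=True) + 1e-12
    eps = rho[:, None] * grad / row_norms
    # Descend using the gradient evaluated at the perturbed weights.
    _, grad_perturbed = loss_and_grad(W + eps, X, Y)
    return W - lr * grad_perturbed


# Small demonstration on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
W_true = rng.normal(size=(4, 5))
Y = X @ W_true.T
# Rows start at different scales so the importance proxy differs per neuron.
W = rng.normal(size=(4, 5)) * np.array([[0.1], [0.5], [1.0], [2.0]])

initial_loss, _ = loss_and_grad(W, X, Y)
for _ in range(100):
    W = adaptive_sam_step(W, X, Y)
final_loss, _ = loss_and_grad(W, X, Y)
print("loss before:", initial_loss, "loss after:", final_loss)
```

In a real network the importance score would come from whatever pruning criterion is in use, and the radius would be recomputed as the scores change during training. The sketch covers only the perturbation step; the pruning and post-pruning robustness stages are not shown.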

Paper Structure (32 sections, 6 equations, 8 figures, 17 tables, 2 algorithms)

Introduction
Related Works
The AdaSAP Method
Adaptive Weight Perturbation
Neuron Removal
Robustness Encouragement
Final Comments
Experiments & Results
Experiment Details
ImageNet Classification
Object Detection
Ablations
Sharpness Analysis
Limitations
Conclusion
...and 17 more sections

Figures (8)

Figure 1: Robustness of pruned models trained on ImageNet-1K drastically degrades on ImageNet-C as pruning ratio increases for many SOTA pruning methods. AdaSAP reduces the degradation in robust performance relative to standard validation performance. We approach the grey dashed line, which indicates an ideal scenario in which robust performance does not degrade at higher rates than validation performance.
Figure 2: AdaSAP is a three step process that takes as input a dense pretrained model and outputs a sparse robust model. The process can be used with any pruning method.
Figure 3: (Left) Before pruning, encourage neurons that will be pruned to lie within a flat minimum, since their removal will affect the loss less. (Right) After pruning, promote robustness by encouraging flatness across the network.
Figure 4: ImageNet C Dataset. Performance difference on various ImageNet C corruption types on models of varying sparsity. Accuracy improvement is the Top1 accuracy on a ResNet50 model trained and pruned using AdaSAP minus that of a Taylor pruned model.
Figure 5: Pascal VOC-C dataset. Performance difference on various ImageNet C corruption types on models of varying sparsity. mAP improvement is the mAP of a ResNet50 model trained and pruned using AdaSAP minus that of a HALP pruned model.
...and 3 more figures
