Adaptive Sharpness-Aware Pruning for Robust Sparse Networks
Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, Jose Alvarez
TL;DR
AdaSAP addresses the problem of obtaining compact neural networks that remain robust to input variations unseen during training. It introduces a three-step framework: it first uses adaptive weight perturbations to push candidates for pruning into flat regions of the loss landscape, then performs structured channel pruning, and finally enforces uniform flatness to encourage robustness. By tying pruning and robustness to the same sharpness objective, AdaSAP achieves better robustness than prior pruning methods on ImageNet-C and ImageNet-V2, as well as improved mAP under corrupted conditions for object detection, across multiple architectures and pruning regimes. The method is agnostic to the pruning criterion and shows that sharpness-based optimization can bridge efficiency and generalization, with practical implications for deploying robust sparse models in real-world settings. The authors also provide ablations and an analysis of sharpness that explain why flatter minima make networks both easier to prune and more robust.
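To make the first step more concrete, the following is a minimal sketch of one sharpness-aware update with a per-channel perturbation radius. It is not the authors' implementation. The toy quadratic loss, the `CURVATURE` table, the use of the channel L2 norm as an importance score, the linear mapping from score to radius, and the values of `rho_min`, `rho_max`, and `lr` are all assumptions chosen for illustration. The only idea taken from the summary is that channels more likely to be pruned receive larger perturbations.

```python
import math

# Hypothetical per-weight curvatures of a toy quadratic loss (stand-in for a network loss).
CURVATURE = [[4.0, 4.0], [1.0, 1.0], [0.25, 0.25]]


def loss(channels):
    """Toy loss: 0.5 * sum(c * w^2) over all weights, grouped by channel."""
    return 0.5 * sum(
        c * w * w
        for cs, ws in zip(CURVATURE, channels)
        for c, w in zip(cs, ws)
    )


def grad(channels):
    """Gradient of the toy loss, with the same channel grouping as the weights."""
    return [[c * w for c, w in zip(cs, ws)] for cs, ws in zip(CURVATURE, channels)]


def l2(vector):
    return math.sqrt(sum(x * x for x in vector))


def adaptive_radii(channels, rho_min=0.01, rho_max=0.1):
    """Assumed rule: low-importance channels (small L2 norm) get radii near rho_max."""
    scores = [l2(ch) for ch in channels]
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return [rho_max - (s - lo) / span * (rho_max - rho_min) for s in scores]


def perturb(channels, radii, eps=1e-12):
    """Move each channel along its normalized gradient by that channel's radius."""
    g = grad(channels)
    return [
        [w + rho * gi / (l2(gc) + eps) for w, gi in zip(ch, gc)]
        for ch, gc, rho in zip(channels, g, radii)
    ]


def sharpness_aware_step(channels, lr=0.1):
    """Update the original weights using the gradient taken at the perturbed weights."""
    radii = adaptive_radii(channels)
    g_perturbed = grad(perturb(channels, radii))
    return [
        [w - lr * gi for w, gi in zip(ch, gc)]
        for ch, gc in zip(channels, g_perturbed)
    ]


weights = [[1.0, -1.0], [0.5, 0.5], [0.1, -0.1]]
radii = adaptive_radii(weights)
new_weights = sharpness_aware_step(weights)
```

In this sketch the smallest-norm channel receives the largest radius, so the update for that channel is driven by the loss in a wider neighborhood around its weights. A real pruning pipeline would replace the norm-based score with whatever importance criterion the pruning method uses.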
Abstract
Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world. The goals of robustness and compactness may seem to be at odds, since robustness requires generalization across domains, while the process of compression exploits specificity in one domain. We introduce Adaptive Sharpness-Aware Pruning (AdaSAP), which unifies these goals through the lens of network sharpness. The AdaSAP method produces sparse networks that are robust to input variations unseen at training time. We achieve this by strategically incorporating weight perturbations in order to optimize the loss landscape. This allows the model to be both primed for pruning and regularized for improved robustness. AdaSAP improves the robust accuracy of pruned models on image classification by up to +6% on ImageNet-C and +4% on ImageNet-V2, and on object detection by +4% on a corrupted Pascal VOC dataset, over a wide range of compression ratios, pruning criteria, and network architectures, outperforming recent pruning art by large margins.
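For orientation, weight-perturbation methods of this kind are usually written as the min-max objective of sharpness-aware minimization, together with its first-order approximation of the inner maximizer. The notation below is an assumption for illustration and does not come from the abstract: $w$ denotes the weights, $L$ the training loss, and $\rho$ the perturbation radius, which an adaptive variant would set separately for each group of weights.

```latex
\min_{w} \; \max_{\|\epsilon\|_2 \le \rho} L(w + \epsilon),
\qquad
\hat{\epsilon}(w) = \rho \, \frac{\nabla_w L(w)}{\|\nabla_w L(w)\|_2}
```

The weights are then updated with the gradient evaluated at $w + \hat{\epsilon}(w)$, which favors regions where the loss stays low throughout a neighborhood of radius $\rho$.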
