Isomorphic Pruning for Vision Models
Gongfan Fang, Xinyin Ma, Michael Bi Mi, Xinchao Wang
TL;DR
Isomorphic Pruning addresses the difficulty of comparing the importance of heterogeneous sub-structures in modern vision models by decomposing a network into groups of isomorphic sub-structures based on topology and pruning within each group. The method models sub-structures as graphs, uses graph isomorphism to cluster similar motifs, and applies a chosen importance criterion (e.g., Magnitude or Taylor) to rank and prune within groups, enabling reliable, architecture-agnostic pruning. Empirically, it yields competitive or superior accuracy with reduced MACs and parameters across ConvNext, ResNet, MobileNet-v2, and Vision Transformers on ImageNet-1K, while also reducing latency and memory usage on real hardware. The approach demonstrates the practicality of topology-aware pruning for heterogeneous vision models and offers a flexible framework compatible with multiple pruning criteria and architectures, accompanied by open-source code.
Abstract
Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures. However, assessing the relative importance of different sub-structures remains a significant challenge, particularly in advanced vision models featuring novel mechanisms and architectures such as self-attention, depth-wise convolutions, or residual connections. These heterogeneous sub-structures usually exhibit divergent parameter scales, weight distributions, and computational topologies, which makes their importance difficult to compare. To overcome this, we present Isomorphic Pruning, a simple approach that is effective across a range of network architectures, such as Vision Transformers and CNNs, and delivers competitive performance across different model sizes. Isomorphic Pruning stems from the observation that, when evaluated under a pre-defined importance criterion, heterogeneous sub-structures show significant divergence in their importance distributions, whereas isomorphic structures present similar importance patterns. This inspires us to rank and compare each type of sub-structure in isolation for more reliable pruning. Our empirical results on ImageNet-1K demonstrate that Isomorphic Pruning surpasses several pruning baselines specifically designed for Transformers or CNNs. For instance, we improve the accuracy of DeiT-Tiny from 74.52% to 77.50% by pruning an off-the-shelf DeiT-Base model. For ConvNext-Tiny, we improve performance from 82.06% to 82.18% while reducing the number of parameters and memory usage. Code is available at https://github.com/VainF/Isomorphic-Pruning.
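
The contrast between a single global ranking and isolated, per-group ranking can be illustrated with a toy sketch. This is not the released implementation: the function names, the toy scores, and the string `signature` (a stand-in for an actual graph-isomorphism test) are all hypothetical, and applying the same pruning ratio to every group is an assumption made for simplicity.

```python
import math
from collections import defaultdict


def global_prune(substructures, ratio):
    """Baseline: rank all sub-structures together and drop the lowest-scoring ones."""
    ranked = sorted((importance, name) for name, _, importance in substructures)
    k = math.floor(len(ranked) * ratio)
    return sorted(name for _, name in ranked[:k])


def isomorphic_prune(substructures, ratio):
    """Rank only within groups that share a topology signature.

    `signature` is a hashable placeholder for the sub-structure's graph
    topology (an assumption; a real system would test graph isomorphism).
    """
    groups = defaultdict(list)
    for name, signature, importance in substructures:
        groups[signature].append((importance, name))

    pruned = []
    for members in groups.values():
        members.sort()  # ascending importance within the group
        k = math.floor(len(members) * ratio)  # same ratio per group (assumption)
        pruned.extend(name for _, name in members[:k])
    return sorted(pruned)


# Toy scores: the two sub-structure types live on very different scales.
toy = [
    ("attn_head_0", "attention_head", 0.9),
    ("attn_head_1", "attention_head", 0.7),
    ("attn_head_2", "attention_head", 0.8),
    ("attn_head_3", "attention_head", 0.6),
    ("mlp_ch_0", "mlp_channel", 0.04),
    ("mlp_ch_1", "mlp_channel", 0.01),
    ("mlp_ch_2", "mlp_channel", 0.03),
    ("mlp_ch_3", "mlp_channel", 0.02),
]

print(global_prune(toy, 0.5))      # removes every MLP channel
print(isomorphic_prune(toy, 0.5))  # removes the weakest half of each type
```

In this toy setting the global ranking removes all four MLP channels purely because their scores sit on a smaller scale, while the grouped ranking removes the two least important members of each type.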
