Spatial Re-parameterization for N:M Sparsity
Yuxin Zhang, Mingbao Lin, Mingliang Xu, Yonghong Tian, Rongrong Ji
TL;DR
This work addresses the performance gap between structured N:M sparsity and unstructured sparsity by revealing a key property: unstructured sparsity exhibits spatial sparsity variability that benefits feature localization. The authors propose Spatial Re-parameterization (SpRe), which adds a train-time auxiliary branch that follows the unstructured spatial sparsity distribution and can be re-parameterized into the main N:M branch for inference, preserving the N:M pattern and incurring no extra inference cost. Through BN-enabled multi-branch optimization, SpRe maintains spatial sparsity diversity during training and yields consistent accuracy gains across CIFAR-10, ImageNet-1K, and COCO tasks, even matching or surpassing some unstructured sparsity methods at comparable sparsity levels. The approach is modular and orthogonal to existing N:M methods, offering a practical path to closing the performance gap while preserving hardware-friendly acceleration on N:M sparse tensor cores.
Abstract
This paper presents a Spatial Re-parameterization (SpRe) method for N:M sparsity. SpRe stems from the observation that N:M sparsity offers far less variety in the spatial sparsity of convolution kernels than unstructured sparsity does. In particular, N:M sparsity exhibits a fixed sparsity rate across the spatial domains, because its pattern mandates N non-zero components among every M successive weights along the input channel dimension of convolution filters. In contrast, we observe that conventional unstructured sparsity displays substantial variation in sparsity across the spatial domains, which we experimentally verify to be crucial for its stronger performance retention compared with N:M sparsity. SpRe therefore adopts the spatial-sparsity distribution of unstructured sparsity by adding an extra branch alongside the original N:M branch at training time, which allows the N:M sparse network to maintain a spatial-sparsity distribution similar to that of unstructured sparsity. During inference, the extra branch can be re-parameterized into the main N:M branch without distorting the sparse pattern or adding computation cost. With SpRe, N:M sparsity methods match the performance of state-of-the-art unstructured sparsity methods across various benchmarks. Our project is available at https://github.com/zyxxmu/SpRE.
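The observation behind SpRe can be illustrated with a small NumPy sketch. It is not taken from the project repository. The helper names (`nm_mask`, `unstructured_mask`, `spatial_sparsity`), the magnitude-based selection rule, and the random weight tensor are all assumptions made for illustration. The sketch builds a 2:4 mask along the input channel dimension and a global magnitude mask at the same overall sparsity, then measures the sparsity rate at each spatial location of each filter.

```python
import numpy as np

def nm_mask(weight, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m consecutive
    input channels. `weight` has shape (C_out, C_in, K_h, K_w)."""
    c_out, c_in, kh, kw = weight.shape
    assert c_in % m == 0, "C_in must be divisible by m"
    # Put input channels last so that each group of m is contiguous.
    mag = np.abs(weight).transpose(0, 2, 3, 1).reshape(c_out, kh, kw, c_in // m, m)
    order = np.argsort(mag, axis=-1)  # ascending order of magnitude
    mask = np.zeros(mag.shape, dtype=bool)
    np.put_along_axis(mask, order[..., -n:], True, axis=-1)
    # Restore the (C_out, C_in, K_h, K_w) layout.
    return mask.reshape(c_out, kh, kw, c_in).transpose(0, 3, 1, 2)

def unstructured_mask(weight, keep_ratio):
    """Global magnitude pruning (an assumed baseline): keep the largest
    `keep_ratio` fraction of weights, wherever they are located."""
    k = int(round(weight.size * keep_ratio))
    threshold = np.sort(np.abs(weight), axis=None)[-k]
    return np.abs(weight) >= threshold

def spatial_sparsity(mask):
    """Fraction of pruned weights at each (filter, K_h, K_w) position,
    measured across the input channel dimension."""
    return 1.0 - mask.mean(axis=1)

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 16, 3, 3))  # hypothetical conv weight

nm_rates = spatial_sparsity(nm_mask(w, n=2, m=4))
un_rates = spatial_sparsity(unstructured_mask(w, keep_ratio=0.5))

print("N:M spatial sparsity std:         ", nm_rates.std())
print("unstructured spatial sparsity std:", un_rates.std())
```

Under the 2:4 mask every spatial location of every filter ends up with exactly the same sparsity rate, so the standard deviation is zero, whereas the unstructured mask produces rates that differ from location to location. This is the spatial-sparsity variety that the extra training-time branch is meant to restore.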
