UniPTS: A Unified Framework for Proficient Post-Training Sparsity

Jingjing Xie; Yuxin Zhang; Mingbao Lin; Zhihang Lin; Liujuan Cao; Rongrong Ji

TL;DR

UniPTS targets the accuracy loss that Post-Training Sparsity (PTS) suffers when only a small calibration set is available. It jointly optimizes three components: a base-decayed KL loss that transfers knowledge from the dense model, an evolutionary search for a global sparsity distribution with a reducing-regrowing mechanism to curb overfitting, and dynamic sparse training to explore sparse structures stably. The base-decayed objective, $\mathcal{L}_{DKL}$, adapts the gradient scale over training to sustain guidance from the dense network to the sparse one. The sparsity distribution is found by an evolutionary process that first reduces the network to a sparsity $P_e$ above the target ($P_e > P$) and then regrows weights to meet the global sparsity $P$, with fitness regularized by BN statistics and input noise. Empirically, UniPTS outperforms POT across datasets and architectures: for ResNet-50 on ImageNet at 90% sparsity it reaches 68.6% top-1 accuracy where POT reaches 3.9%, and it delivers substantial detection gains for Faster-RCNN and SSD at high sparsity. The findings highlight the value of combining a global objective, a robust sparsity distribution search, and dynamic sparse training for practical, data-efficient sparsification. Overall, UniPTS offers a unified framework for PTS that applies to a range of vision tasks and to structured sparsity patterns.
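To make the phrase "adapts the gradient scale" concrete, the following minimal sketch computes a KL divergence with a base-$b$ logarithm. It relies only on the identity $\log_b x = \ln x / \ln b$, so a smaller base yields a larger loss and proportionally larger gradients. The function names and the linear schedule in `base_at_step` (including its endpoints) are assumptions for illustration; this summary does not give the exact form of $\mathcal{L}_{DKL}$ or its decay schedule.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def kl_with_base(dense_logits, sparse_logits, base):
    """KL(dense || sparse) measured with a base-`base` logarithm.

    Changing the base only rescales the natural-log KL by 1 / ln(base).
    """
    log_p = log_softmax(dense_logits)
    log_q = log_softmax(sparse_logits)
    kl_nat = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl_nat / math.log(base)

def base_at_step(step, total_steps, start=math.e, end=1.1):
    """Hypothetical linear decay of the base (an assumption, not the paper's schedule)."""
    frac = step / max(total_steps - 1, 1)
    return start + (end - start) * frac

dense = [2.0, 1.0, 0.1]    # logits from the dense (teacher) network
sparse = [1.5, 1.2, 0.3]   # logits from the sparse (student) network
early = kl_with_base(dense, sparse, base_at_step(0, 10))
late = kl_with_base(dense, sparse, base_at_step(9, 10))
print(early < late)  # the loss scale grows as the base decays toward 1
```

Under this assumed schedule the same teacher-student mismatch produces a stronger training signal late in training, which is one plausible reading of sustaining guidance from the dense network.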

Abstract

Post-training Sparsity (PTS) is a recently emerged avenue that pursues efficient network sparsity using only limited data. Existing PTS methods, however, suffer significant performance degradation compared with traditional methods that retrain the sparse network on the whole dataset, especially at high sparsity ratios. In this paper, we attempt to close this gap by transposing three cardinal factors that profoundly alter the performance of conventional sparsity into the context of PTS. Our endeavors comprise (1) a base-decayed sparsity objective that promotes efficient knowledge transfer from the dense network to its sparse counterpart; (2) a reducing-regrowing search algorithm designed to find the optimal sparsity distribution while avoiding overfitting to the small calibration set in PTS; and (3) dynamic sparse training built on the preceding two aspects, aimed at comprehensively optimizing the sparsity structure while ensuring training stability. Our proposed framework, termed UniPTS, is validated to be much superior to existing PTS methods across extensive benchmarks. As an illustration, it raises the performance of POT, a recently proposed recipe, from 3.9% to 68.6% when pruning ResNet-50 at a 90% sparsity ratio on ImageNet. We release the code of our paper at https://github.com/xjjxmu/UniPTS.
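The arithmetic behind the reducing-regrowing idea can be sketched as follows: per-layer sparsity rates are first fixed at an excessive global sparsity, then relaxed until the global target is met. The layer sizes, the rates, and the uniform-scaling rule in `regrow_to_target` are hypothetical; the abstract does not state which weights UniPTS regrows or how it chooses them.

```python
def global_sparsity(layer_sizes, layer_rates):
    """Fraction of all weights that are pruned, given per-layer sparsity rates."""
    total = sum(layer_sizes)
    pruned = sum(n * r for n, r in zip(layer_sizes, layer_rates))
    return pruned / total

def regrow_to_target(layer_sizes, layer_rates, target):
    """Hypothetical regrowing rule: shrink every layer's rate by one common factor.

    Scaling all rates by target / current makes the global sparsity equal
    `target` exactly, and no layer becomes sparser than it was.
    """
    current = global_sparsity(layer_sizes, layer_rates)
    if current < target:
        raise ValueError("distribution is already below the target sparsity")
    scale = target / current
    return [r * scale for r in layer_rates]

sizes = [1000, 4000, 5000]      # number of weights per layer (illustrative)
reduced = [0.80, 0.95, 0.97]    # rates found at an excessive global sparsity
regrown = regrow_to_target(sizes, reduced, target=0.90)
print(global_sparsity(sizes, reduced), global_sparsity(sizes, regrown))
```

A real implementation would presumably decide per layer how much to regrow rather than scaling uniformly; the sketch only shows that regrowing from an over-pruned distribution can land on the global target.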

Paper Structure (16 sections, 6 equations, 4 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Post-Training Model Compression
Sparsity Distribution
Dynamic Sparsity Training
Method
Background
UniPTS
Base-Decayed Sparsity Objective
Reducing-Regrowing Sparsity Distribution
Sparsity Training
Experiments
Settings
Main Results
Ablation Study
...and 1 more sections

Figures (4)

Figure 1: Comparison between POT and our UniPTS framework. Left: POT uses a predefined sparsity distribution to obtain a fixed sparse structure and retrains the pruned layer with a local sparsity objective. Right: UniPTS searches for the optimal sparsity distribution and leverages a global sparsity objective and dynamic sparsity training to explore optimal sparse structures.
Figure 2: Overview of our method. Left: the search for the sparsity distribution, where evolutionary search handles the vast solution space. Right: intuition for the training process. After finding the optimal sparsity distribution, we use the base-decayed KD loss and dynamic sparse training to retrain the pruned network.
Figure 3: Sparsity distribution obtained by different methods.
Figure 4: Performance influence of the sparse training strategy.
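
The dynamic sparse training mentioned in the Figure 2 caption can be pictured with a generic prune-and-regrow mask update. The sketch below follows a common magnitude-drop, gradient-grow pattern in the style of RigL; it is an assumption chosen for illustration and is not taken from UniPTS, whose exact update rule this summary does not describe.

```python
def update_mask(weights, grads, mask, k):
    """One prune-and-regrow step that keeps the number of active weights fixed.

    weights, grads: flat lists of floats; mask: flat list of 0/1 flags.
    """
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]
    k = min(k, len(active), len(inactive))
    # Drop the k active weights with the smallest magnitude.
    drop = sorted(active, key=lambda i: abs(weights[i]))[:k]
    # Regrow the k inactive positions with the largest gradient magnitude.
    grow = sorted(inactive, key=lambda i: abs(grads[i]), reverse=True)[:k]
    new_mask = list(mask)
    for i in drop:
        new_mask[i] = 0
    for i in grow:
        new_mask[i] = 1
    return new_mask

weights = [0.5, -0.01, 0.3, 0.0, 0.0, 0.0]
grads = [0.1, 0.1, 0.1, 0.9, 0.2, 0.05]
mask = [1, 1, 1, 0, 0, 0]
print(update_mask(weights, grads, mask, k=1))  # → [1, 0, 1, 1, 0, 0]
```

Because every dropped weight is matched by a regrown one, the layer's sparsity rate stays fixed while the sparse structure itself is free to change during training.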
