PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models

Zining Wang, Jinyang Guo, Ruihao Gong, Yang Yong, Aishan Liu, Yushi Huang, Jiaheng Liu, Xianglong Liu

TL;DR

PTSBench tackles the need for a comprehensive evaluation of post-training sparsity by systematically benchmarking both sparsity algorithms and model sparsification capabilities. It introduces five tracks and an open-source framework to study how sparsity allocation and reconstruction interact with diverse architectures and tasks. The study reveals that learning-based sparsity allocation and block-wise reconstruction yield strong performance, that attention-based models exhibit higher sparsity potential, and that generation tasks require dedicated sparsity methods. The work provides actionable guidance for designing sparse deployment-friendly models and offers a reproducible platform for future PTS research.

Abstract

With the increased attention to model efficiency, post-training sparsity (PTS) has become increasingly prevalent because of its effectiveness and efficiency. However, open questions remain about the best practices for PTS algorithms and the sparsification ability of models, which hinder the further development of this area. Therefore, a benchmark that comprehensively investigates these issues is urgently needed. In this paper, we propose PTSBench, the first comprehensive post-training sparsity benchmark covering both algorithms and models. We benchmark 10+ general, pluggable, fine-grained PTS techniques on 3 typical tasks using over 40 off-the-shelf model architectures. Through extensive experiments and analyses, we obtain valuable conclusions and provide several insights from both the algorithm and model perspectives. PTSBench provides (1) new observations for a better understanding of PTS algorithms, (2) in-depth and comprehensive evaluations of the sparsification ability of models, and (3) a well-structured and easy-to-integrate open-source framework. We hope this work will provide illuminating conclusions and advice for future studies of post-training sparsity methods and sparsification-friendly model design. The code for PTSBench is released at https://github.com/ModelTC/msbench.

Paper Structure

This paper contains 22 sections, 7 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: The illustration of the overall Post-Training Sparsity pipeline, which is employed by most PTS methods.
  • Figure 2: Evaluation tracks of PTSBench. We benchmark the performance of fine-grained PTS algorithms and the sparsification ability of models on a comprehensive range of evaluation tracks: "Sparsity Allocation", "Reconstruction", "Neural Architectures", "Model Size Robustness", and "Different Tasks". An overview of the results for each track is shown on the right of the figure.
  • Figure 3: Visualization of sparsity allocation of ResNet-32 at a sparsity rate of 90% on CIFAR-100.
  • Figure 4: Mean relative accuracy loss of different model sizes.
  • Figure 5: The overall evaluation results of PTSBench.