Revisiting Large Language Model Pruning using Neuron Semantic Attribution

Yizhuo Ding, Xinwei Sun, Yanwei Fu, Guosheng Hu

TL;DR

The paper studies how well large language model pruning generalizes across tasks and datasets. It finds that the choice of calibration data strongly shapes pruning outcomes and that sentiment classification can suffer substantial accuracy drops under common pruning regimes. It evaluates three post-training pruning methods (SparseGPT, Wanda, and RIA) across 14 models, 24 datasets, and four task categories, using accuracy as the primary metric and examining factors such as sparsity and sequence length. To explain pruning-induced performance changes, the authors introduce Neuron Semantic Attribution (NSA), a framework that links neuron activations to influential input words in three steps (influential word selection, neuron–word matching, and unpruned-vs-pruned comparison), and they demonstrate NSA visualizations on Yelp and ARC-C data. The results show that pruning effects depend on both task and data, and they point toward more robust, interpretable pruning methods and calibration-data strategies for deploying compressed LLMs.

Abstract

Model pruning is vital for accelerating large language models because it reduces their size and computational requirements. However, the generalizability of existing pruning methods across diverse datasets and tasks remains unclear. We therefore conduct extensive evaluations on 24 datasets and 4 tasks using popular pruning methods. Based on these evaluations, we find, and then investigate, that the calibration set greatly affects the performance of pruning methods. In addition, we surprisingly find a significant performance drop of existing pruning methods on sentiment classification tasks. To understand the link between the performance drop and the pruned neurons, we propose Neuron Semantic Attribution, which learns to associate each neuron with specific semantics. This method makes the unpruned neurons of LLMs explainable for the first time.
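
The three NSA steps named in the TL;DR (influential word selection, neuron–word matching, unpruned-vs-pruned comparison) can be illustrated with a small sketch. This is not the authors' implementation: the text here does not say how influence is scored or how neurons are matched, and the abstract describes NSA as learned, so the top-k selection and argmax matching below are simplifying assumptions. All function names are hypothetical. Among the toy numbers, only the 'badly' pair (0.2112 and 0.0084 for neuron 2194) comes from the Figure 1 caption; the rest are invented for illustration.

```python
from typing import Dict, List

# Activations are stored as activations[neuron_id][word] (hypothetical layout).
Activations = Dict[int, Dict[str, float]]


def select_influential_words(influence: Dict[str, float], k: int) -> List[str]:
    """Step 1 (assumed form): keep the k words with the largest influence score."""
    return sorted(influence, key=lambda w: influence[w], reverse=True)[:k]


def match_neuron(activations: Activations, word: str) -> int:
    """Step 2 (assumed form): pick the neuron that responds most strongly to `word`."""
    return max(activations, key=lambda n: activations[n].get(word, 0.0))


def compare(unpruned: Activations, pruned: Activations,
            neuron: int, words: List[str]) -> Dict[str, float]:
    """Step 3: per-word activation drop (unpruned minus pruned) for one neuron."""
    return {w: unpruned[neuron][w] - pruned[neuron][w] for w in words}


# Toy inputs: only the 'badly' values for neuron 2194 are taken from Figure 1;
# every other number, and neuron 7, is made up for illustration.
influence = {"badly": 0.9, "the": 0.1, "food": 0.4}
unpruned = {
    2194: {"badly": 0.2112, "the": 0.01, "food": 0.05},
    7: {"badly": 0.02, "the": 0.03, "food": 0.10},
}
pruned = {
    2194: {"badly": 0.0084, "the": 0.01, "food": 0.04},
    7: {"badly": 0.02, "the": 0.03, "food": 0.10},
}

words = select_influential_words(influence, k=2)
neuron = match_neuron(unpruned, words[0])
drops = compare(unpruned, pruned, neuron, words)
print(neuron, drops)
```

In this toy setup the neuron matched to 'badly' loses most of its activation after pruning, which is the kind of evidence Figure 1 uses to explain degraded sentiment classification.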

Paper Structure

This paper contains 25 sections, 1 equation, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Application of the NSA method on Yelp data, showing the activations of neuron 2194 in layer 15. Darker colors represent stronger activations. For example, the activation for the sentiment-related word 'badly' drops from 0.2112 in the unpruned model to 0.0084 in the pruned model, which explains the degraded sentiment classification of the pruned model.
  • Figure 2: The framework of NSA. Step 1, Influential Words Selection; Step 2, Neuron Matching; Step 3, Comparison.
  • Figure 3: The average accuracy of pruned models for different token sequence lengths.
  • Figure 4: The accuracy of three pruning methods using 9 different calibration data averaged over 24 tasks.
  • Figure 5: Accuracy of pruned models on different calibration data with 2:4 sparsity averaged over 4 kinds of tasks.
  • ...and 2 more figures
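
The 2:4 sparsity setting in Figure 5 and the reported sensitivity to calibration data can be made concrete with a minimal sketch. It assumes a Wanda-style importance score, |W_ij| multiplied by the L2 norm of input feature j over the calibration tokens, as published for Wanda; the text above does not restate that metric, and SparseGPT and RIA score weights differently. The function names and all numbers are hypothetical.

```python
import math
from typing import List


def feature_norms(calib: List[List[float]]) -> List[float]:
    """L2 norm of each input feature over the calibration tokens (rows = tokens)."""
    n_features = len(calib[0])
    return [math.sqrt(sum(row[j] ** 2 for row in calib)) for j in range(n_features)]


def prune_2_of_4(weights: List[List[float]], norms: List[float]) -> List[List[float]]:
    """Keep the 2 highest-scoring weights in every group of 4 along the input dim."""
    pruned = []
    for row in weights:
        new_row = list(row)
        for start in range(0, len(row), 4):
            group = range(start, min(start + 4, len(row)))
            # Wanda-style score (assumed): weight magnitude times activation norm.
            scores = {j: abs(row[j]) * norms[j] for j in group}
            keep = sorted(scores, key=lambda j: scores[j], reverse=True)[:2]
            for j in group:
                if j not in keep:
                    new_row[j] = 0.0
        pruned.append(new_row)
    return pruned


# One output neuron with four input weights (made-up values).
weights = [[0.5, -0.1, 0.3, 0.2]]

# Two hypothetical calibration sets, one token each.
calib_a = [[1.0, 1.0, 1.0, 1.0]]   # all input features equally active
calib_b = [[0.1, 10.0, 0.1, 1.0]]  # feature 1 dominates

mask_a = prune_2_of_4(weights, feature_norms(calib_a))
mask_b = prune_2_of_4(weights, feature_norms(calib_b))
print(mask_a)
print(mask_b)
```

With the uniform calibration set the two largest-magnitude weights survive, while the skewed set keeps a different pair. The same weight matrix therefore yields a different pruned model depending on the calibration data, which is the effect Figures 4 and 5 measure at scale.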