SPIN: Sparsifying and Integrating Internal Neurons in Large Language Models for Text Classification

Difan Jiao, Yilun Liu, Zhenwei Tang, Daniel Matter, Jürgen Pfeffer, Ashton Anderson

TL;DR

SPIN addresses a limitation of current LLM-based text classification, its reliance on terminal hidden states alone, by harnessing internal representations from intermediate layers. It sparsifies each layer's neurons with linear probing to identify salient units, then integrates these salient features across layers into multi-grained inputs for a lightweight classification head, all without updating the LLM weights. The approach is model-agnostic and compatible with parameter-efficient fine-tuning, and it improves accuracy, training and inference efficiency, and interpretability. Results on IMDb, SST-2, and EDOS show consistent gains over baselines, performance competitive with fully fine-tuned models, and clear advantages in speed and explainability, making SPIN a practical alternative for task-specific text classification with large language models.

Abstract

Among the many tasks that Large Language Models (LLMs) have revolutionized is text classification. Current text classification paradigms, however, rely solely on the output of the final layer in the LLM, with the rich information contained in internal neurons largely untapped. In this study, we present SPIN: a model-agnostic framework that sparsifies and integrates internal neurons of intermediate layers of LLMs for text classification. Specifically, SPIN sparsifies internal neurons by linear probing-based salient neuron selection layer by layer, avoiding noise from unrelated neurons and ensuring efficiency. The cross-layer salient neurons are then integrated to serve as multi-layered features for the classification head. Extensive experimental results show our proposed SPIN significantly improves text classification accuracy, efficiency, and interpretability.
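
The two stages named in the abstract, per-layer sparsification and cross-layer integration, can be illustrated with a minimal sketch. Several details here are assumptions rather than the paper's stated method: the probe is a plain L1-penalized logistic regression, a neuron counts as salient if it belongs to the smallest set whose absolute probe weights reach a fraction `eta` of the layer's total, and random synthetic arrays stand in for the frozen LLM's per-layer activations. The names `train_probe`, `select_salient`, and `integrate` are hypothetical.

```python
import numpy as np

def train_probe(X, y, l1=1e-3, lr=0.5, steps=300):
    """Fit a binary logistic-regression probe with an L1 penalty (assumed setup)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))      # predicted P(y = 1)
        w -= lr * (X.T @ (p - y) / n + l1 * np.sign(w))
        b -= lr * np.mean(p - y)
    return w, b

def select_salient(w, eta):
    """Indices of the fewest neurons whose |weight| mass reaches eta of the total.

    This cumulative-mass rule is an assumption made for illustration.
    """
    order = np.argsort(-np.abs(w))                  # largest |weight| first
    cum = np.cumsum(np.abs(w)[order])
    k = int(np.searchsorted(cum, eta * cum[-1])) + 1
    return np.sort(order[:min(k, len(w))])

def integrate(layer_acts, selected):
    """Concatenate the salient neurons of every layer into one feature matrix."""
    return np.concatenate(
        [X[:, idx] for X, idx in zip(layer_acts, selected)], axis=1
    )

# Synthetic stand-in for frozen per-layer activations: 3 layers, 32 neurons each.
# Only neurons 0 and 1 of each layer carry label signal; the rest are noise.
rng = np.random.default_rng(0)
n, d, n_layers = 400, 32, 3
y = rng.integers(0, 2, n).astype(float)
layer_acts = []
for _ in range(n_layers):
    X = rng.normal(size=(n, d))
    X[:, :2] += (2 * y - 1)[:, None]
    layer_acts.append(X)

# Stage 1: sparsify each layer independently with its own linear probe.
selected = [select_salient(train_probe(X, y)[0], eta=0.5) for X in layer_acts]

# Stage 2: integrate the salient neurons and train a lightweight head on them.
feats = integrate(layer_acts, selected)
w_head, b_head = train_probe(feats, y, l1=0.0)
acc = np.mean(((feats @ w_head + b_head) > 0) == (y == 1))
print("neurons kept per layer:", [len(s) for s in selected])
print("training accuracy of the head:", acc)
```

Only the probes and the head are trained in this sketch; the activations are treated as fixed inputs, mirroring the claim that the LLM weights are never updated. Lowering `eta` keeps fewer neurons per layer, which shrinks the head's input and the cost of training it.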

Paper Structure

This paper contains 49 sections, 12 equations, 7 figures, 12 tables.

Figures (7)

  • Figure 1: Overview of (a) the baseline method, which uses only the terminal hidden states; (b) SPIN, which feeds the classification head with sparsified and integrated internal neurons from each intermediate layer.
  • Figure 2: Floating-point operation cost of training for the baseline, SPIN, and full fine-tuning on different models. The cost of SPIN is estimated on FFN activations with $\eta=0.5$, and the cost of fine-tuning is estimated under the lowest-demand assumption of 1 epoch.
  • Figure 3: Activation probability distributions for individual salient neurons and the integrated classifier at different layers of GPT2-XL. (Top) Distributions for SPIN when integrating neurons up to the specified layer, along with text classification accuracy scores. (Bottom) Distributions for the most salient neurons, ranked by the importance attributed to them by layer-wise neuron selection. Red regions indicate predictions for negative samples, and green regions for positive ones.
  • Figure 4: (Top) SPIN performance as a function of the sparsification threshold $\eta$. (Bottom) The percentage of neurons selected as salient for each $\eta$.
  • Figure 5: The number of trainable parameters for the baseline LLM backbone and SPIN.
  • ...and 2 more figures