Comprehensive and Efficient Distillation for Lightweight Sentiment Analysis Models

Guangyu Xie, Yice Zhang, Jianzhu Bao, Qianlong Wang, Yang Sun, Bingbing Wang, Ruifeng Xu

TL;DR

This work addresses two practical limitations of sentiment-analysis distillation by introducing CompEffDist, a framework that automatically constructs comprehensive instructions from sentiment attributes and applies difficulty-based data filtering to improve data efficiency. The attribute-based instruction module compiles thousands of attributes into hundreds of analytical perspectives and generates diverse tasks from them, covering a broad range of analytical angles without manual instruction writing. The difficulty-based filtering module uses a ranking-based score (with an optional proxy) to prioritize harder samples, which reduces the amount of distillation data required. Across multiple model families (Llama-3, Qwen-3, Gemma-3), 3B student models match the performance of 20x larger teacher models on most tasks, and the approach reaches the same performance level as baseline methods with only 10% of the distillation data.

Abstract

Recent efforts leverage knowledge distillation techniques to develop lightweight and practical sentiment analysis models. These methods are grounded in human-written instructions and large-scale user texts. Despite the promising results, two key challenges remain: (1) manually written instructions are limited in diversity and quantity, making them insufficient to ensure comprehensive coverage of distilled knowledge; (2) large-scale user texts incur high computational cost, hindering the practicality of these methods. To this end, we introduce CompEffDist, a comprehensive and efficient distillation framework for sentiment analysis. Our framework consists of two key modules: attribute-based automatic instruction construction and difficulty-based data filtering, which correspondingly tackle the aforementioned challenges. Applying our method across multiple model series (Llama-3, Qwen-3, and Gemma-3), we enable 3B student models to match the performance of 20x larger teacher models on most tasks. In addition, our approach greatly outperforms baseline methods in data efficiency, attaining the same performance level with only 10% of the data.
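The difficulty-based filtering idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how difficulty is scored, so the sketch assumes each sample already carries a numeric difficulty score (higher means harder). The function name `filter_by_difficulty` and the parameters `keep_ratio` and `easy_share` are hypothetical. The sketch keeps a fixed fraction of the data, takes most of it from the hardest samples, and retains a small random share of easier ones so that simple samples are reduced rather than removed entirely.

```python
import random


def filter_by_difficulty(samples, scores, keep_ratio=0.1, easy_share=0.2, seed=0):
    """Keep a fraction of the samples, favoring the hardest ones.

    Assumptions (not from the paper): `scores` are precomputed difficulty
    values, higher = harder; `easy_share` is the portion of the kept set
    drawn at random from the remaining, easier samples.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")
    if not samples:
        return []

    n_keep = max(1, round(len(samples) * keep_ratio))
    n_easy = int(n_keep * easy_share)
    n_hard = n_keep - n_easy

    # Rank sample indices from hardest to easiest.
    ranked = sorted(range(len(samples)), key=lambda i: scores[i], reverse=True)
    hard = ranked[:n_hard]
    rest = ranked[n_hard:]

    # Draw a small, reproducible random share from the easier remainder.
    rng = random.Random(seed)
    easy = rng.sample(rest, min(n_easy, len(rest)))

    return [samples[i] for i in hard + easy]


# Toy usage: 100 samples whose difficulty equals their index.
toy_samples = list(range(100))
toy_scores = [float(i) for i in toy_samples]
kept = filter_by_difficulty(toy_samples, toy_scores, keep_ratio=0.1)
```

With `keep_ratio=0.1`, the toy call keeps ten samples: the eight hardest plus two drawn at random from the rest. In a real pipeline, the scores would come from whatever difficulty measure the distillation setup provides.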

Paper Structure

This paper contains 24 sections, 9 equations, 9 figures, 18 tables.

Figures (9)

  • Figure 1: Illustration of our approach: (1) we extract sentiment knowledge from the teacher model through instructions and user texts and then utilize it to optimize the student model; (2) we generate diverse instructions based on various analytical perspectives to ensure comprehensive distillation; (3) we assess the difficulty of instructions and user texts and then reduce the proportion of simple samples to ensure efficient distillation.
  • Figure 2: Illustration of attribute-based automatic instruction construction.
  • Figure 3: Generated instruction for tone profiling.
  • Figure 4: Visualization of the representative analytical perspectives. A more complete visualization is provided in Figure \ref{fig:t-SNE} of Appendix \ref{app:b}.
  • Figure 5: Type distribution of generated tasks.
  • ...and 4 more figures