Learning Intrinsic Dimension via Information Bottleneck for Explainable Aspect-based Sentiment Analysis
Zhenxiao Cheng, Jie Zhou, Wen Wu, Qin Chen, Liang He
TL;DR
This paper addresses the uneven importance of embedding dimensions in Aspect-based Sentiment Analysis (ABSA) with IBG, a gradient-based explanation framework that learns a compact intrinsic dimension via an information bottleneck (IB). Its iBiL module compresses word embeddings into a low-dimensional intrinsic space while keeping the original embeddings fixed, and IBG combines gradient signals from both spaces to produce sentiment-aware explanations. Experiments on four ABSA datasets show that IBG improves both predictive performance and interpretability over strong baselines and standard explanation methods; ablation studies confirm the contributions of the iBiL and IB components, and analyses find effective intrinsic sizes of around 10–20 dimensions. The framework is model-agnostic and could extend to other ABSA tasks and, in future work, to large-scale language models, with potential benefits for explainability and efficiency in NLP systems.
Abstract
Gradient-based explanation methods are increasingly used to interpret neural models in natural language processing (NLP) due to their high fidelity. Such methods determine word-level importance using dimension-level gradient values through a norm function, often presuming equal significance for all gradient dimensions. However, in the context of Aspect-based Sentiment Analysis (ABSA), our preliminary research suggests that only specific dimensions are pertinent. To address this, we propose the Information Bottleneck-based Gradient (IBG) explanation framework for ABSA. This framework leverages an information bottleneck to refine word embeddings into a concise intrinsic dimension, maintaining essential features and omitting unrelated information. Comprehensive tests show that our IBG approach considerably improves both the models' performance and interpretability by identifying sentiment-aware features.
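
The norm-based scoring described in the abstract can be illustrated with a minimal sketch. It assumes the per-token gradients with respect to the word embeddings have already been computed and are given as a tokens-by-dimensions array. The function name `token_importance`, the `dim_mask` argument, and the toy numbers are hypothetical and chosen only for illustration; the fixed mask is a hand-picked stand-in and is not the learned iBiL compression proposed in the paper.

```python
import numpy as np

def token_importance(grads, dim_mask=None):
    """Score each token by the L2 norm of its embedding gradient.

    grads:    array of shape (num_tokens, embed_dim), assumed precomputed.
    dim_mask: optional boolean vector of length embed_dim; when given, only
              the selected dimensions contribute to the norm (hypothetical
              stand-in for a learned intrinsic subspace).
    """
    grads = np.asarray(grads, dtype=float)
    if dim_mask is not None:
        grads = grads[:, np.asarray(dim_mask, dtype=bool)]
    scores = np.linalg.norm(grads, axis=1)
    total = scores.sum()
    # Normalize so the scores sum to 1; leave all-zero scores untouched.
    return scores / total if total > 0 else scores

# Toy gradients: 3 tokens x 4 embedding dimensions (made-up numbers).
toy_grads = np.array([
    [3.0, 0.0, 0.0, 4.0],
    [0.0, 5.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

# Standard norm: every dimension counts equally.
all_dims = token_importance(toy_grads)

# Restricted norm: only the first two dimensions count.
first_two = token_importance(toy_grads, dim_mask=[True, True, False, False])
```

With all dimensions, the first two tokens tie; once the norm is restricted to the first two dimensions, the second token ranks higher. This is the kind of ranking change that motivates weighting dimensions unequally.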
