Fine-grained Sentiment Analysis with Faithful Attention
Ruiqi Zhong, Steven Shao, Kathleen McKeown
TL;DR
This work tackles targeted sentiment analysis: predicting the sentiment relation between a specified source and target, using attention supervised by human rationales. The method augments a state-of-the-art relation extractor with a KL-divergence attention loss, $\mathcal{L} = \mathcal{L}_{clf} + \lambda_{attn} \mathcal{L}_{attn}$, where $\mathcal{L}_{attn} = \mathrm{KL}(A \,\|\, \hat{A})$, and optionally a rationale-prediction term $\mathcal{L}_{r}$; a variant trained with only a limited number of rationales is also explored. On MPQA2.0 and GFBF, training the attention yields improvements of 4–8 points over baselines with untrained attention and outperforms a rationale-based multi-task baseline, and only a small fraction of the rationales is needed to achieve substantial gains. The paper also introduces two faithfulness metrics, probes-needed and mass-needed, and runs crowd-sourced plausibility tests, which show that trained attention can be more plausible than untrained attention, though faithfulness is dataset-dependent. The findings suggest that concise human rationales can repair attention in low-data regimes (notably GFBF) and provide meaningful, human-aligned explanations in sentiment-relation tasks, with practical implications for interpretable relation extraction.
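To make the combined objective concrete, the following is a minimal sketch of $\mathcal{L} = \mathcal{L}_{clf} + \lambda_{attn}\,\mathrm{KL}(A \,\|\, \hat{A})$ in plain Python. It is not the authors' implementation. It assumes that $A$ is a target distribution derived from the human rationale and $\hat{A}$ is the model's attention over the same tokens, and that a rationale is converted to a distribution by spreading mass uniformly over the marked tokens; the summary above does not specify either choice. The function names `rationale_to_distribution`, `kl_divergence`, and `total_loss` are hypothetical.

```python
import math


def rationale_to_distribution(rationale_mask):
    """Convert a 0/1 rationale mask into a probability distribution.

    Assumption (not stated in the source): mass is spread uniformly
    over the tokens the annotator marked as rationale.
    """
    total = sum(rationale_mask)
    if total == 0:
        raise ValueError("rationale mask marks no tokens")
    return [m / total for m in rationale_mask]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i).

    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    so that the logarithm stays finite.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def total_loss(clf_loss, target_attn, model_attn, lambda_attn=1.0):
    """Classification loss plus the weighted attention loss KL(A || A_hat).

    Assumption: target_attn plays the role of A (human rationale) and
    model_attn plays the role of A_hat (model attention).
    """
    return clf_loss + lambda_attn * kl_divergence(target_attn, model_attn)


# Toy example: four tokens, annotator marks the two middle tokens.
target = rationale_to_distribution([0, 1, 1, 0])   # [0.0, 0.5, 0.5, 0.0]
uniform_attention = [0.25, 0.25, 0.25, 0.25]       # untrained, spread-out attention
attn_loss = kl_divergence(target, uniform_attention)  # equals log(2), about 0.693
loss = total_loss(0.5, target, uniform_attention, lambda_attn=1.0)
```

In this toy case the attention loss is zero only when the model places all of its attention on the two marked tokens in equal parts, which is the sense in which the extra term pulls attention toward the human rationale.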
Abstract
While the general task of textual sentiment classification has been widely studied, much less research looks specifically at sentiment between a specified source and target. To tackle this problem, we experimented with a state-of-the-art relation extraction model. Surprisingly, we found that despite reasonable performance, the model's attention was often systematically misaligned with the words that contribute to sentiment. We therefore trained the model's attention directly with human rationales, which improved performance by a robust 4–8 points on all tasks we defined on our data sets. We also present a rigorous analysis of the model's attention, both trained and untrained, using novel and intuitive metrics. Our results show that untrained attention does not provide faithful explanations; however, attention trained with concisely annotated human rationales not only increases performance but also yields faithful explanations. Encouragingly, a small number of annotated human rationales suffices to correct the attention in our task.
