Why is "Problems" Predictive of Positive Sentiment? A Case Study of Explaining Unintuitive Features in Sentiment Classification
Jiaming Qu, Jaime Arguello, Yue Wang
TL;DR
This paper addresses a common problem in sentiment classification: some predictive input features appear unintuitive to humans. It combines an LLM-based zero-shot estimator with three explanation tools (data distribution, training examples, and contextual patterns) to detect and explain such features. In a two-phase crowdsourced study (N=300) across product categories, the authors show that single tools can aid objective judgments, but the best understanding and user experience come from combining tools that link predictive features to training data and contextual usage. The work offers practical guidance for designing XAI explanations that not only identify predictive features but also show why they are predictive, which may increase trust and learning in real-world tasks.
Abstract
Explainable AI (XAI) algorithms aim to help users understand how a machine learning model makes predictions. To this end, many approaches explain which input features are most predictive of a target label. However, such explanations can still be puzzling to users (e.g., in product reviews, the word "problems" is predictive of positive sentiment). If left unexplained, puzzling explanations can have negative impacts. Explaining unintuitive associations between an input feature and a target label is an underexplored area in XAI research. We make an initial effort in this direction, using unintuitive associations learned by sentiment classifiers as a case study. We propose approaches for (1) automatically detecting associations that can appear unintuitive to users and (2) generating explanations that help users understand why an unintuitive feature is predictive. A crowdsourced study (N=300) showed that the proposed approaches can effectively detect and explain predictive but unintuitive features in sentiment classification.
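
To make the "problems" example concrete, the following is a minimal sketch of how a data-distribution view and a contextual-pattern view of a single word could be computed from labelled reviews. It is an illustration only and not the authors' implementation: the toy reviews are invented, and the helper names `tokenize`, `label_distribution`, and `preceding_words` are hypothetical.

```python
from collections import Counter
import re

# Toy labelled reviews, invented for illustration (not the paper's data).
REVIEWS = [
    ("Works great, no problems at all", "positive"),
    ("Had no problems with installation", "positive"),
    ("Used it for a year without problems", "positive"),
    ("Constant problems with the battery", "negative"),
    ("Terrible build quality", "negative"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_distribution(reviews, word):
    """Count the labels of the reviews that contain `word`."""
    counts = Counter()
    for text, label in reviews:
        if word in tokenize(text):
            counts[label] += 1
    return counts


def preceding_words(reviews, word, label):
    """Count the token immediately before `word` in reviews with `label`."""
    counts = Counter()
    for text, review_label in reviews:
        if review_label != label:
            continue
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == word and i > 0:
                counts[tokens[i - 1]] += 1
    return counts


if __name__ == "__main__":
    print(label_distribution(REVIEWS, "problems"))
    print(preceding_words(REVIEWS, "problems", "positive"))
```

On this toy data, "problems" occurs mostly in positive reviews, and the words preceding it there are negators such as "no" and "without", which is the kind of contextual usage that can make an unintuitive association understandable.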
