Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery
Yifan Sun, Danding Wang, Qiang Sheng, Juan Cao, Jintao Li
TL;DR
The paper tackles the lack of human-aligned interpretability in text explanations by introducing ECO-Concept, an intrinsically interpretable framework that automatically discovers comprehensible concepts without concept annotations using a slot-attention-based extractor. It integrates LLM-based comprehensibility evaluation as a feedback signal to refine concept representations, balancing task discriminativity with human interpretability. Empirical results across seven datasets show ECO-Concept achieves competitive or superior performance relative to supervised and unsupervised baselines while delivering more comprehensible concepts, as supported by both quantitative metrics and human studies. This approach offers a practical path toward trustworthy, explanation-rich NLP models without the need for extensive concept annotations.
Abstract
Concept-based approaches have emerged as a promising direction in explainable AI because they interpret models in a way that aligns with human reasoning. However, their adoption in the text domain remains limited. Most existing methods rely on predefined concept annotations and cannot discover unseen concepts, while methods that extract concepts without supervision often produce explanations that are not intuitively comprehensible to humans, potentially diminishing user trust. Neither line of work discovers comprehensible concepts automatically. To address this issue, we propose ECO-Concept, an intrinsically interpretable framework that discovers comprehensible concepts with no concept annotations. ECO-Concept first uses an object-centric architecture to extract semantic concepts automatically. The comprehensibility of the extracted concepts is then evaluated by large language models. Finally, the evaluation result guides subsequent model fine-tuning to obtain more understandable explanations. Experiments show that our method achieves superior performance across diverse tasks. Further concept evaluations validate that the concepts learned by ECO-Concept surpass existing counterparts in comprehensibility.
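To make the extraction step more concrete, the following is a minimal sketch of slot attention applied to token embeddings, the general mechanism the TL;DR names for the concept extractor. It is not the authors' implementation. The function name `slot_attention`, the random projection matrices standing in for learned weights, and the plain weighted-mean slot update (the standard formulation uses a GRU and MLP here) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def slot_attention(tokens, num_slots=4, iters=3, seed=0, eps=1e-8):
    """Illustrative slot attention over token embeddings.

    tokens: array of shape (n_tokens, dim).
    Returns (slots, attn): slots is (num_slots, dim) and
    attn is (num_slots, n_tokens).
    """
    rng = np.random.default_rng(seed)
    n, d = tokens.shape
    # Hypothetical random projections standing in for learned weights.
    w_q, w_k, w_v = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
    slots = rng.normal(size=(num_slots, d))  # each slot is a candidate concept
    k, v = tokens @ w_k, tokens @ w_v
    for _ in range(iters):
        q = slots @ w_q
        logits = q @ k.T / np.sqrt(d)        # (num_slots, n_tokens)
        attn = softmax(logits, axis=0)       # slots compete for each token
        # Normalize over tokens so each slot takes a weighted mean of values.
        weights = attn / (attn.sum(axis=1, keepdims=True) + eps)
        slots = weights @ v                  # simplified update (no GRU/MLP)
    return slots, attn
```

Because the softmax runs over the slot axis, each token's attention sums to one across slots, so slots compete to explain tokens. In a concept-discovery setting, the tokens with the highest attention for a given slot would be the natural candidates to show to a human or an LLM when judging that slot's comprehensibility.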
