Exploring the Reliability of Self-explanation and its Relationship with Classification in Language Model-driven Financial Analysis
Han Yuan, Li Zhang, Zheng Ma
TL;DR
The paper investigates the reliability of self-explanations accompanying LM-based financial classifications, focusing on zero-shot tasks in the finance domain. It uses three instruction-tuned LMs to classify a processed German credit dataset and annotates their self-explanations for factuality and causality, testing the link between each property and classification accuracy with $\chi^2$ analyses (significance threshold $P \le 0.05$). The key findings show that both factuality and causality relate to classification performance, with factuality serving as the stronger proxy, and that data preprocessing can improve both metrics and downstream decisions. This work supports using explanation quality as a proxy for confidence and as a lever to optimize LM-driven financial classification in practice.
Abstract
Language models (LMs) have exhibited exceptional versatility in reasoning and in-depth financial analysis through their proprietary information processing capabilities. Previous research has focused on evaluating classification performance, often overlooking explainability or presuming that a more refined explanation corresponds to higher classification accuracy. Using a public dataset in the finance domain, we quantitatively evaluated self-explanations by LMs, focusing on their factuality and causality. We identified a statistically significant relationship between the accuracy of classifications and the factuality or causality of self-explanations. Our study built an empirical foundation for approximating classification confidence through self-explanations and for optimizing classification via proprietary reasoning.
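
The relationship test described above can be illustrated with a minimal sketch of a Pearson $\chi^2$ test of independence on a 2x2 contingency table (explanation factual or not, classification correct or not). The counts, the function name `chi_square_2x2`, the 2x2 table shape, and the absence of a continuity correction are all assumptions made for illustration; they are not taken from the paper's data or its exact procedure.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied (an assumption of this sketch).
    Returns the chi-square statistic and its p-value for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, NOT results from the paper:
# rows    = explanation factual / explanation not factual
# columns = classification correct / classification incorrect
hypothetical_counts = [[60, 20],
                       [25, 45]]

chi2, p = chi_square_2x2(hypothetical_counts)
print(f"chi2 = {chi2:.3f}, p = {p:.3g}, significant at 0.05: {p <= 0.05}")
```

A table whose rows share the same correct-to-incorrect ratio yields a statistic of zero and a p-value of one, which is the "no relationship" case the test is designed to rule out.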
