Counterfactual Explanations of Black-box Machine Learning Models using Causal Discovery with Applications to Credit Rating

Daisuke Takahashi, Shohei Shimizu, Takuma Tanaka

TL;DR

The paper tackles the challenge of explaining black-box predictions when the causal relationships among features are unknown. It extends the LEWIS framework by integrating causal discovery with prior information on the causal structure, so that counterfactual explanations and the Nesuf score can be estimated from $P(Y\mathrel{\mid}\mathrm{do}(X))$ and related probabilities without a prespecified causal graph. Experiments on artificial data show that incorporating prior information and causal discovery yields more accurate explanatory scores and the correct feature ordering, even when the graph is not known. The approach is then applied to real credit rating data from Shiga Bank, where the inferred causal graph matches domain knowledge and yields meaningful explanations for rating decisions. The work lays groundwork for robust XAI in settings with unknown causal structure and suggests extensions to multi-class problems.

Abstract

Explainable artificial intelligence (XAI) has helped elucidate the internal mechanisms of machine learning algorithms, bolstering their reliability by demonstrating the basis of their predictions. Several XAI models consider causal relationships to explain models by examining the input-output relationships of prediction models and the dependencies between features. The majority of these models have based their explanations on counterfactual probabilities, assuming that the causal graph is known. However, this assumption complicates the application of such models to real data, given that the causal relationships between features are unknown in most cases. Thus, this study proposed a novel XAI framework that relaxed the constraint that the causal graph is known. This framework leveraged counterfactual probabilities and additional prior information on causal structure, facilitating the integration of a causal graph estimated through causal discovery methods and a black-box classification model. Furthermore, explanatory scores were estimated based on counterfactual probabilities. Numerical experiments on artificial data confirmed that the explanatory score can be estimated more accurately than when no causal graph is used. Finally, as an application to real data, we constructed a classification model of credit ratings assigned by Shiga Bank, Shiga prefecture, Japan. We demonstrated the effectiveness of the proposed method in cases where the causal graph is unknown.
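
To make the role of the causal graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-feature linear model X1 -> X2 with uniform noise, a hypothetical stand-in classifier `black_box`, and a simple contrast $P(Y{=}1\mathrel{\mid}\mathrm{do}(X_1{=}x)) - P(Y{=}1\mathrel{\mid}\mathrm{do}(X_1{=}x'))$ as the explanatory quantity; whether this contrast matches the paper's exact score estimator is not confirmed here. All function names and coefficients are illustrative.

```python
import random


def black_box(x1, x2):
    """Hypothetical stand-in for an opaque classifier (not from the paper)."""
    return 1 if 0.5 * x1 + 1.0 * x2 > 0 else 0


def simulate(n, b=0.8, seed=0):
    """Draw n samples from the toy model X1 -> X2 with uniform (non-Gaussian) noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = b * x1 + rng.uniform(-1.0, 1.0)
        data.append((x1, x2))
    return data


def fit_slope(data):
    """Least-squares slope of X2 on X1 without intercept (both are zero-mean here)."""
    sxx = sum(x1 * x1 for x1, _ in data)
    sxy = sum(x1 * x2 for x1, x2 in data)
    return sxy / sxx


def p_do(data, slope, x1_new):
    """Graph-aware estimate of P(Y=1 | do(X1 = x1_new)).

    Each unit keeps its own residual of X2; X1 is set to x1_new and the
    change is propagated to X2 through the estimated edge X1 -> X2.
    """
    hits = 0
    for x1, x2 in data:
        resid = x2 - slope * x1
        hits += black_box(x1_new, slope * x1_new + resid)
    return hits / len(data)


def p_fixed(data, x1_new):
    """Graph-free estimate: overwrite X1 but leave X2 at its observed value."""
    return sum(black_box(x1_new, x2) for _, x2 in data) / len(data)


if __name__ == "__main__":
    data = simulate(5000)
    slope = fit_slope(data)
    with_graph = p_do(data, slope, 0.5) - p_do(data, slope, -0.5)
    without_graph = p_fixed(data, 0.5) - p_fixed(data, -0.5)
    print(f"estimated slope X1 -> X2: {slope:.3f}")
    print(f"contrast using the graph: {with_graph:.3f}")
    print(f"contrast ignoring the graph: {without_graph:.3f}")
```

In this toy setting the graph-aware contrast comes out noticeably larger than the graph-free one, because an intervention on X1 also shifts X2 through the edge between them. In the setting the abstract describes, the graph would not be given but estimated by a causal discovery method constrained by prior information.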

Paper Structure (16 sections, 8 equations, 7 figures, 10 tables)


Figures (7)

  • Figure 1: Framework of counterfactual probability explanations using causal structure information
  • Figure 2: Causal graph used in analysis. The values on the directed edges represent the coefficients of the respective structural equations.
  • Figure 3: Prior information on the causal structure. (a) Target variable Y has a direct parent-child relationship with all explanatory variables. (b) Target variable Y is the sink variable.
  • Figure 4: Causal graph in artificial data experiments
  • Figure 5: Causal graph estimated by DirectLiNGAM (black lines). Prior information on the causal structure (red lines).
  • ...and 2 more figures