Model-Based Counterfactual Explanations Incorporating Feature Space Attributes for Tabular Data
Yuta Sumiya, Hayaru Shouno
TL;DR
This work tackles counterfactual (CF) explanations for tabular data by handling categorical perturbations through TargetEncoding (TE) and by learning a latent CF generator with normalizing flows (FastDCFlow). The method optimizes a joint objective over likelihood, validity, and proximity, so that a trained model can rapidly generate a diverse, proximal set of CFs for each input without running a heavy optimization per input. TE improves perturbation realism and diversity, while the normalizing-flow latent space supports efficient, model-based CF generation that balances multiple quality metrics. Empirical results on three open datasets show FastDCFlow achieving strong diversity and proximity, competitive validity, and superior speed compared with baselines, highlighting its practical potential for real-world decision support with tabular data.
Abstract
Machine-learning models, which can accurately predict patterns from large datasets, are crucial in decision making. Consequently, counterfactual explanations, methods that explain predictions by introducing input perturbations, have become prominent. These perturbations often suggest ways to alter the predictions, leading to actionable recommendations. However, current techniques require solving an optimization problem for each new input, rendering them computationally expensive. In addition, traditional encoding methods inadequately address the perturbations of categorical variables in tabular data. Thus, this study proposes FastDCFlow, an efficient counterfactual explanation method using normalizing flows. The proposed method captures complex data distributions, learns meaningful latent spaces that retain proximity, and improves predictions. For categorical variables, we employ TargetEncoding, which respects ordinal relationships and includes perturbation costs. The proposed method outperformed existing methods in multiple metrics, striking a balance between the trade-offs for counterfactual explanations. The source code is available in the following repository: https://github.com/sumugit/FastDCFlow.
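To make the idea of target encoding for categorical variables concrete, the following is a minimal sketch in plain Python, not the implementation from the repository. It maps each category to a smoothed mean of a binary target, which places the categories on a numeric scale where they can be ordered and where the distance between two encoded values can serve as a perturbation cost. The function names (`fit_target_encoding`, `perturbation_cost`, `decode_nearest`), the additive smoothing scheme, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of the target (illustrative assumption)."""
    if len(categories) != len(targets) or not categories:
        raise ValueError("categories and targets must be non-empty and equal in length")
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    # Shrink each category mean toward the global mean; rare categories shrink more.
    return {
        c: (sums[c] + smoothing * global_mean) / (counts[c] + smoothing)
        for c in counts
    }


def perturbation_cost(source, destination, encoding):
    """Cost of changing one category into another, as a distance in encoded space."""
    return abs(encoding[source] - encoding[destination])


def decode_nearest(value, encoding):
    """Map a perturbed continuous value back to the closest known category."""
    return min(encoding, key=lambda c: abs(encoding[c] - value))


# Toy data (hypothetical): an education level and a binary outcome.
education = ["hs", "hs", "bsc", "bsc", "bsc", "phd"]
outcome = [0, 0, 0, 1, 1, 1]

encoding = fit_target_encoding(education, outcome)
ordered = sorted(encoding, key=encoding.get)  # categories ordered by encoded value
```

In this toy example the encoded values order the categories as `hs`, `bsc`, `phd`, so moving from `hs` to `phd` costs more than moving from `hs` to `bsc`, and a perturbed value such as 0.6 decodes back to `bsc`. One-hot encoding offers no such ordering or graded cost, which is the limitation of traditional encodings that the abstract points to.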
