Probabilistically Plausible Counterfactual Explanations with Normalizing Flows

Patryk Wielopolski, Oleksii Furman, Jerzy Stefanowski, Maciej Zięba

TL;DR

PPCEF addresses the need for counterfactual explanations that are not only valid (achieve the target class) but also probabilistically plausible with respect to the data distribution in high-dimensional tabular data. It proposes an unconstrained optimization framework that directly optimizes a density-based plausibility term p(x|y) estimated by conditional normalizing flows (MAF), alongside a distance term and a discriminative validity term. The approach enables efficient gradient-based counterfactual estimation with batch processing and demonstrates superior plausibility, validity, and speed compared to baselines across multiple datasets and classifiers. By modeling complex data distributions without restricting to a fixed parametric family, PPCEF improves trust and actionability of explanations and opens pathways for incorporating future constraints like sparsity or actionability in counterfactuals.

Abstract

We present PPCEF, a novel method for generating probabilistically plausible counterfactual explanations (CFs). PPCEF advances beyond existing methods by combining a probabilistic formulation that leverages the data distribution with the optimization of plausibility within a unified framework. Compared to reference approaches, our method enforces plausibility by directly optimizing the explicit density function without assuming a particular family of parametrized distributions. This ensures that CFs are not only valid (i.e., achieve class change) but also aligned with the underlying data's probability density. For that purpose, our approach leverages normalizing flows as powerful density estimators to capture the complex high-dimensional data distribution. Furthermore, we introduce a novel loss that balances the trade-off between achieving class change and maintaining closeness to the original instance while also incorporating a probabilistic plausibility term. PPCEF's unconstrained formulation allows for efficient gradient-based optimization with batch processing, leading to computation that is orders of magnitude faster than prior methods. Moreover, the unconstrained formulation of PPCEF allows for the seamless integration of future constraints tailored to specific counterfactual properties. Finally, extensive evaluations demonstrate PPCEF's superiority in generating high-quality, probabilistically plausible counterfactual explanations in high-dimensional tabular settings. This makes PPCEF a powerful tool not only for interpreting complex machine learning models but also for improving fairness, accountability, and trust in AI systems.
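
The abstract describes a loss with three parts: closeness to the original instance, a class-change (validity) term, and a density-based plausibility term, all minimized without constraints by gradient descent. The sketch below illustrates that general shape on a 2-D toy problem and is not the authors' implementation. Everything in it is an assumption made for illustration: the hinge form of the two penalty terms, the weights `lam` and `eps`, the logistic classifier, the finite-difference gradients, and above all the unit-variance Gaussian that stands in for the conditional normalizing flow.

```python
import math

# Every name and constant here is an illustrative assumption, not from the paper.
W, B = (1.0, 1.0), 0.0                 # toy linear classifier weights and bias
MU = {0: (-2.0, -2.0), 1: (2.0, 2.0)}  # class means of the stand-in density


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def prob_target(x, target):
    """Toy discriminative model: logistic regression p(y = target | x)."""
    p1 = 1.0 / (1.0 + math.exp(-(W[0] * x[0] + W[1] * x[1] + B)))
    return p1 if target == 1 else 1.0 - p1


def log_density(x, y):
    """Stand-in for log p(x | y): a unit-variance Gaussian, NOT a flow."""
    return -0.5 * sq_dist(x, MU[y]) - math.log(2.0 * math.pi)


def loss(x, x0, target, log_delta, lam=10.0, eps=0.1):
    """Distance term plus weighted validity and plausibility hinges."""
    validity = max(0.5 + eps - prob_target(x, target), 0.0)
    plausibility = max(log_delta - log_density(x, target), 0.0)
    return sq_dist(x, x0) + lam * (validity + plausibility)


def is_feasible(x, target, log_delta, eps=0.1):
    """True when both hinge terms are zero."""
    return (prob_target(x, target) >= 0.5 + eps
            and log_density(x, target) >= log_delta)


def find_counterfactual(x0, target, log_delta, steps=300, lr=0.01, h=1e-5):
    """Gradient descent with central finite differences; returns the feasible
    iterate closest to x0, or None if no iterate was feasible."""
    x, best = list(x0), None
    for _ in range(steps):
        grad = []
        for i in range(2):
            up, down = list(x), list(x)
            up[i] += h
            down[i] -= h
            grad.append((loss(up, x0, target, log_delta)
                         - loss(down, x0, target, log_delta)) / (2.0 * h))
        x = [x[i] - lr * grad[i] for i in range(2)]
        if is_feasible(x, target, log_delta) and (
                best is None or sq_dist(x, x0) < sq_dist(best, x0)):
            best = tuple(x)
    return best


x0 = (-2.0, -2.0)                       # instance currently assigned to class 0
log_delta = log_density((3.5, 2.0), 1)  # threshold: within radius 1.5 of MU[1]
cf = find_counterfactual(x0, target=1, log_delta=log_delta)
```

Here `cf` is a point that the toy classifier assigns to class 1 and whose stand-in log-density clears `log_delta`. In a realistic setting the Gaussian would be replaced by a trained conditional density estimator and the finite differences by automatic differentiation, which is also what would make batch processing of many instances practical.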

Paper Structure (26 sections, 11 equations, 1 figure, 10 tables)

Figures (1)

  • Figure 1: Probabilistically Plausible Counterfactual Explanation Estimation Process on the Moons Dataset. We show the evolution of an instance from its initial position (black dot) to the final counterfactual (red dot) against the linear classifier's decision boundary (blue line) and density threshold contours, highlighting the method's trajectory toward the target classification and the probabilistic plausibility condition.