Flexible Counterfactual Explanations with Generative Models
Stig Hellemans, Andres Algaba, Sam Verboven, Vincent Ginis
TL;DR
FCEGAN addresses the rigidity of counterfactual explanations by introducing counterfactual templates, which let users specify at inference time which features may change, and it operates in black-box settings by learning from historical predictions. The framework combines a GAN-based generator with dual discriminators and a divergence term, integrated with gradient-guided optimization, to produce realistic, valid counterfactuals that respect user constraints. Experiments on healthcare and finance datasets show improved validity and usable explanations across varying degrees of flexibility, with some loss of diversity that the divergence term can mitigate. The approach offers practical, personalized explanations in real-world settings where user constraints differ, without requiring retraining or model access, which supports deployment in high-stakes domains and regulated environments.
Abstract
Counterfactual explanations provide actionable insights to achieve desired outcomes by suggesting minimal changes to input features. However, existing methods rely on fixed sets of mutable features, which makes counterfactual explanations inflexible for users with heterogeneous real-world constraints. Here, we introduce Flexible Counterfactual Explanations, a framework incorporating counterfactual templates, which allow users to dynamically specify mutable features at inference time. Our implementation, FCEGAN, uses Generative Adversarial Networks to align explanations with user-defined constraints without requiring model retraining or additional optimization. Furthermore, FCEGAN is designed for black-box scenarios, leveraging historical prediction datasets to generate explanations without direct access to model internals. Experiments across economic and healthcare datasets demonstrate that FCEGAN significantly improves the validity of counterfactual explanations compared to traditional benchmark methods. By integrating user-driven flexibility and black-box compatibility, counterfactual templates support personalized explanations tailored to user constraints.
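To make the notion of a counterfactual template concrete, the following minimal sketch treats a template as a per-feature binary mask (1 = mutable, 0 = immutable) and shows the constraint it encodes. This is an illustration only, not the FCEGAN method: the abstract describes a generator that aligns its output with the template directly, whereas this sketch simply masks an unconstrained candidate after the fact. The function names, the feature layout, and the example values are all hypothetical.

```python
from typing import List, Sequence


def apply_template(
    original: Sequence[float],
    candidate: Sequence[float],
    template: Sequence[int],
) -> List[float]:
    """Keep candidate values only where the template marks a feature mutable.

    Assumption: a template is a 0/1 mask with one entry per feature.
    Immutable features (0) are reset to their original values.
    """
    if not (len(original) == len(candidate) == len(template)):
        raise ValueError("original, candidate, and template must have equal length")
    return [c if m else o for o, c, m in zip(original, candidate, template)]


def changed_features(
    original: Sequence[float], counterfactual: Sequence[float]
) -> List[int]:
    """Return the indices of features that differ from the original instance."""
    return [i for i, (o, c) in enumerate(zip(original, counterfactual)) if o != c]


# Hypothetical instance with features [age, income, savings].
original = [35, 40000, 2000]
# Hypothetical unconstrained proposal from some generator.
candidate = [30, 52000, 5000]

# Two users with different constraints, expressed at inference time.
template_a = [0, 1, 1]  # user A: age is immutable
template_b = [0, 0, 1]  # user B: only savings may change

cf_a = apply_template(original, candidate, template_a)
cf_b = apply_template(original, candidate, template_b)

print(cf_a, changed_features(original, cf_a))
print(cf_b, changed_features(original, cf_b))
```

The same candidate yields different explanations for the two users, because each template restricts which features may differ from the original. Post-hoc masking of this kind does not guarantee that the masked result still flips the model's prediction.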
