CF-OPT: Counterfactual Explanations for Structured Prediction
Germain Vivier-Ardisson, Alexandre Forel, Axel Parmentier, Thibaut Vidal
TL;DR
CF-OPT addresses the lack of interpretability of structured prediction pipelines by computing counterfactual explanations that are both close to the original input and plausible with respect to the data manifold. The key idea is to model plausibility in the latent space of a Variational Autoencoder (VAE), using a latent hypersphere together with a cost-aware VAE training objective, and then to find counterfactuals by solving a constrained optimization problem with a first-order method, the Modified Differential Method of Multipliers (MDMM). The approach yields relative, absolute, and epsilon-explanations, and demonstrates improved plausibility and stability against adversarial perturbations in high-dimensional settings (e.g., shortest paths on Warcraft maps). The work advances explainable AI for pipelines that combine prediction and optimization, and provides a scalable, principled framework for local sensitivity analysis that supports transparency and trust in structured pipelines.
Abstract
Optimization layers in deep neural networks have enjoyed growing popularity in structured learning, improving the state of the art on a variety of applications. Yet, these pipelines lack interpretability since they are made of two opaque layers: a highly non-linear prediction model, such as a deep neural network, and an optimization layer, which is typically a complex black-box solver. Our goal is to improve the transparency of such methods by providing counterfactual explanations. Building upon variational autoencoders (VAEs), we propose a principled way of obtaining counterfactuals: working in the latent space leads to a natural notion of plausibility of explanations. Finally, we introduce a variant of the classic VAE training loss that improves performance in our specific structured context. These provide the foundations of CF-OPT, a first-order optimization algorithm that can find counterfactual explanations for a broad class of structured learning architectures. Our numerical results show that both close and plausible explanations can be obtained for problems from the recent literature.
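To give a feel for the kind of first-order constrained search that MDMM performs, here is a minimal sketch on a toy problem. It is not the CF-OPT pipeline: the function name `mdmm_project`, the squared-distance objective, the single equality constraint placing the latent point on a sphere of radius `radius`, and all hyperparameter values are assumptions chosen purely for illustration. The sketch runs gradient descent on the latent point and gradient ascent on a Lagrange multiplier, with a quadratic damping term added to the Lagrangian.

```python
def mdmm_project(z0, radius=1.0, damping=1.0, lr=0.01, steps=5000):
    """Toy MDMM: minimise ||z - z0||^2 subject to ||z||^2 - radius^2 = 0.

    Illustrative only: objective, constraint and hyperparameters are
    assumptions, not the formulation used by CF-OPT.
    """
    z = list(z0)  # start the search from the original latent point
    lam = 0.0     # Lagrange multiplier of the sphere constraint
    for _ in range(steps):
        g = sum(v * v for v in z) - radius ** 2  # constraint value g(z)
        # Gradient in z of f(z) + lam * g(z) + (damping / 2) * g(z)^2,
        # with grad f = 2 (z - z0) and grad g = 2 z.
        coeff = lam + damping * g
        grad = [2.0 * (v - v0) + coeff * 2.0 * v for v, v0 in zip(z, z0)]
        z = [v - lr * gv for v, gv in zip(z, grad)]  # descent step on z
        lam += lr * g                                # ascent step on lam
    return z, lam


z_star, lam_star = mdmm_project([3.0, 4.0])
print(z_star, lam_star)
```

For the starting point `[3.0, 4.0]` and unit radius, the constrained minimiser is the radial projection `[0.6, 0.8]`, so the iterates should approach that point. In an actual counterfactual search, the toy objective and constraint would be replaced by terms that depend on the decoder, the prediction model, and the optimization layer.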
