CF-OPT: Counterfactual Explanations for Structured Prediction

Germain Vivier-Ardisson, Alexandre Forel, Axel Parmentier, Thibaut Vidal

TL;DR

CF-OPT tackles the interpretability challenge of structured prediction pipelines by learning counterfactual explanations that are both close to the original input and plausible under the data manifold. The key idea is to model plausibility in latent space with a Variational Autoencoder and a latent hypersphere, coupled with a cost-aware VAE objective, then solve a first-order constrained optimization (MDMM) to find counterfactuals. The approach yields relative, absolute, and epsilon-explanations, and demonstrates improved plausibility and stability against adversarial perturbations in high-dimensional settings (e.g., Warcraft map shortest paths). The work advances explainable AI for complex, joint prediction-and-optimization systems and provides a scalable, principled framework for local sensitivity analysis with practical impact on transparency and trust in structured pipelines.
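The "latent hypersphere" idea can be illustrated with a small sketch. It assumes, as is usual for VAEs but not stated in this summary, that the latent prior is a standard Gaussian in dimension $d$; samples from such a prior have norms concentrated around $\sqrt{d}$, so a latent point far from that sphere is atypical. The function names and the squared-gap penalty below are illustrative choices, not the paper's exact regularizer.

```python
import math
import random


def latent_norm(z):
    """Euclidean norm of a latent vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in z))


def hypersphere_penalty(z):
    """Squared gap between ||z|| and sqrt(d).

    Assumption: this is one possible plausibility term, chosen for
    illustration; the paper's regularizer may take a different form.
    """
    return (latent_norm(z) - math.sqrt(len(z))) ** 2


def mean_prior_norm(dim, n_samples=2000, seed=0):
    """Average norm of standard Gaussian samples in dimension `dim`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += latent_norm([rng.gauss(0.0, 1.0) for _ in range(dim)])
    return total / n_samples


if __name__ == "__main__":
    d = 64
    print("sqrt(d):", math.sqrt(d))
    print("mean norm of prior samples:", mean_prior_norm(d))
    print("penalty at the origin:", hypersphere_penalty([0.0] * d))
```

In this sketch the origin, although it is the mode of the Gaussian density, receives a large penalty in high dimension, which is one motivation for measuring plausibility by distance to the sphere rather than by density.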

Abstract

Optimization layers in deep neural networks have enjoyed growing popularity in structured learning, improving the state of the art on a variety of applications. Yet, these pipelines lack interpretability since they are made of two opaque layers: a highly non-linear prediction model, such as a deep neural network, and an optimization layer, which is typically a complex black-box solver. Our goal is to improve the transparency of such methods by providing counterfactual explanations. Building on variational autoencoders, we propose a principled way of obtaining counterfactuals: working in the latent space leads to a natural notion of plausibility of explanations. Finally, we introduce a variant of the classic loss for VAE training that improves their performance in our specific structured context. These provide the foundations of CF-OPT, a first-order optimization algorithm that can find counterfactual explanations for a broad class of structured learning architectures. Our numerical results show that both close and plausible explanations can be obtained for problems from the recent literature.
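The TL;DR names MDMM as the first-order constrained method; assuming this refers to the modified differential method of multipliers, its update rule can be sketched on a toy problem. The example below finds the point closest to a starting latent vector `z0` subject to a single linear equality constraint, which stands in for the explanation condition. In CF-OPT the constraint would involve the decoder, the prediction model, and the solver, none of which are modelled here. All names and hyperparameters are hypothetical.

```python
def mdmm_closest_point(z0, a, b, damping=1.0, lr=0.05, steps=2000):
    """Minimise ||z - z0||^2 subject to a.z = b with MDMM-style updates.

    Toy stand-in: the linear constraint replaces the (non-linear)
    counterfactual condition of a real structured pipeline.
    """
    z = list(z0)
    lam = 0.0  # Lagrange multiplier estimate
    for _ in range(steps):
        # Constraint violation g(z) = a.z - b at the current iterate.
        g = sum(ai * zi for ai, zi in zip(a, z)) - b
        # Gradient descent on z for the damped Lagrangian
        # f(z) + lam * g(z) + (damping / 2) * g(z)^2.
        coeff = lam + damping * g
        z = [zi - lr * (2.0 * (zi - z0i) + coeff * ai)
             for zi, z0i, ai in zip(z, z0, a)]
        # Gradient ascent on the multiplier, driven by the violation.
        lam += lr * g
    return z, lam


if __name__ == "__main__":
    z, lam = mdmm_closest_point([0.0, 0.0], [1.0, 1.0], 2.0)
    print("z:", z)
    print("multiplier:", lam)
```

For this toy instance the iterates approach the orthogonal projection of `z0` onto the line $z_1 + z_2 = 2$, namely $(1, 1)$. The damping term is what distinguishes the scheme from plain gradient descent-ascent on the Lagrangian, which can oscillate around the constraint.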

Paper Structure

This paper contains 44 sections, 22 equations, 17 figures, 1 table, 2 algorithms.

Figures (17)

  • Figure 1: (a) Initial and (b) counterfactual maps with their respective shortest paths (initial and alternative solutions) shown in yellow. The explanation is given by CF-OPT. The corresponding pipeline and experiment are detailed in \ref{sec:exp_warcraft} and serve as a guiding example in this work.
  • Figure 2: Structured learning pipeline.
  • Figure 3: Naive counterfactual search in raw feature space leads to adversarial examples. CF-OPT recovers a plausible explanation, using a VAE trained in a cost-aware fashion and a plausibility-regularized objective based on the latent hypersphere ($\alpha=2, \beta=10$).
  • Figure 4: Pipeline of CF-OPT for plausible explanations in high-dimensional spaces. The input $x$ is encoded and decoded using a CA-VAE. The reconstructed context $\tilde{x}$ is given to the prediction model $\varphi$, a CNN in our guiding example, to obtain the parameters $\theta$. The parameterized optimization model is finally solved to obtain the decision $y^*$.
  • Figure 5: Comparison of VAE and Cost-Aware VAE for varying $\alpha$
  • ...and 12 more figures

Theorems & Definitions (9)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Remark 2.4
  • Remark 3.1
  • Remark 3.2
  • Remark 4.1
  • Remark 4.2
  • Remark 2.1