Graph Edits for Counterfactual Explanations: A comparative study

Angeliki Dimitriou, Nikolaos Chaidos, Maria Lymperaiou, Giorgos Stamou

TL;DR

This paper investigates graph edits as a means of producing conceptual counterfactual explanations for visual classifiers by representing images as scene graphs and measuring semantic edits via graph edit distance. It compares supervised GNNs, unsupervised Graph Autoencoders, and graph kernels to approximate GED for fast, scalable retrieval of minimal, semantically meaningful counterfactuals in a black-box setting, guided by WordNet distances and optimized with graph-based techniques. The approach scales by using the Volgenant-Jonker optimization framework to reduce the NP-hard GED burden, while maintaining a ground-truth reference via closest GED changes. Experiments on Visual Genome with two density-based splits show that supervised GNNs often yield better retrieval quality, whereas GAEs offer faster training and competitive performance, underscoring a trade-off between accuracy and scalability. Overall, the work demonstrates the value of graph-based representations for interpretable, semantically grounded counterfactual explanations in complex visual scenes.

Abstract

Counterfactuals have been established as a popular explainability technique that leverages a set of minimal edits to alter the prediction of a classifier. When considering conceptual counterfactuals on images, the requested edits should correspond to salient concepts present in the input data. At the same time, conceptual distances are defined by knowledge graphs, ensuring the optimality of conceptual edits. In this work, we extend previous endeavors on graph edits as counterfactual explanations by conducting a comparative study which encompasses both supervised and unsupervised Graph Neural Network (GNN) approaches. To this end, we pose the following research question: if we represent input data as graphs, which GNN approach is optimal in terms of performance and time efficiency for generating minimal and meaningful counterfactual explanations for black-box image classifiers?

Paper Structure

This paper contains 15 sections, 1 equation, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: Left - person in front of parked car (safe). Right - person in front of moving cars on the highway (unsafe). The relationship between the person, the car and the highway is critical for the transition to the counterfactual class.
  • Figure 2: Outline of our evaluation framework: Given scene graphs of classes $A\neq B$, a graph model embeds them in a low-dimensional space, allowing the retrieval of the closest graphs $G_A$, $G_B$ from which we extract counterfactual edits.
  • Figure 3: Counterfactuals from the best supervised GNN, unsupervised GNN and kernel for VG-DENSE (top 3) and VG-RANDOM (bottom 2).
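
The retrieval step outlined in Figure 2 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy graphs, the two-dimensional embeddings, and the helper names `nearest_counterfactual` and `triple_edits` are hypothetical and do not come from the paper. The edit extraction here is a plain set difference over (subject, relation, object) triples, a deliberate simplification standing in for graph edit distance with WordNet-based costs.

```python
import math

# Hypothetical toy data: each scene graph has a class label, an embedding
# (standing in for the output of some graph model), and a set of
# (subject, relation, object) triples.
graphs = {
    "g1": {
        "label": "safe",
        "emb": [0.1, 0.9],
        "triples": {("person", "in front of", "car"),
                    ("car", "parked on", "street")},
    },
    "g2": {
        "label": "unsafe",
        "emb": [0.2, 0.8],
        "triples": {("person", "in front of", "car"),
                    ("car", "moving on", "highway")},
    },
    "g3": {
        "label": "unsafe",
        "emb": [0.9, 0.1],
        "triples": {("truck", "moving on", "highway")},
    },
}


def nearest_counterfactual(query_id, graphs):
    """Return the id of the closest graph (Euclidean distance in embedding
    space) whose class label differs from the query's label."""
    query = graphs[query_id]
    candidates = [
        (math.dist(query["emb"], g["emb"]), gid)
        for gid, g in graphs.items()
        if g["label"] != query["label"]
    ]
    return min(candidates)[1]


def triple_edits(source, target):
    """Simplified edit extraction: triples to delete from the source and
    triples to insert so that it matches the target. Not a real GED."""
    return {"delete": source - target, "insert": target - source}


cf_id = nearest_counterfactual("g1", graphs)
edits = triple_edits(graphs["g1"]["triples"], graphs[cf_id]["triples"])
print(cf_id, edits)
```

In this toy setting the "safe" graph g1 retrieves g2, and the extracted edits swap the parked-car triple for the moving-car one, mirroring the example in Figure 1. A faithful implementation would replace the set difference with an approximate GED solver and concept-level costs.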