CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks

Ana Lucic, Maartje ter Hoeve, Gabriele Tolomei, Maarten de Rijke, Fabrizio Silvestri

TL;DR

CF-GNNExplainer introduces a graph-specific counterfactual (CF) explanation method that removes edges from a node's local subgraph until the prediction flips. By framing CF generation as an adjacency-matrix perturbation problem and optimizing a joint loss, it yields minimal, sparse, and accurate counterfactuals across three synthetic GNN explanation datasets. The approach outperforms baselines and a standard GNN explainability method in the CF setting, while highlighting the importance of evaluation protocols and societal considerations for CF explanations. Limitations include support for edge deletions only and a focus on node classification; extensions to feature perturbations and graph-level tasks are planned.

Abstract

Given the increasing promise of graph neural networks (GNNs) in real-world applications, several methods have been developed for explaining their predictions. Existing methods for interpreting predictions from GNNs have primarily focused on generating subgraphs that are especially relevant for a particular prediction. However, such methods are not counterfactual (CF) in nature: given a prediction, we want to understand how the prediction can be changed in order to achieve an alternative outcome. In this work, we propose a method for generating CF explanations for GNNs: the minimal perturbation to the input (graph) data such that the prediction changes. Using only edge deletions, we find that our method, CF-GNNExplainer, can generate CF explanations for the majority of instances across three widely used datasets for GNN explanations, while removing less than 3 edges on average, with at least 94% accuracy. This indicates that CF-GNNExplainer primarily removes edges that are crucial for the original predictions, resulting in minimal CF explanations.
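
The definition in the abstract (the smallest set of edge deletions that changes a node's prediction) can be illustrated with a toy sketch. This is not the optimization used by CF-GNNExplainer: it is a brute-force search over edge subsets, and `predict` is a hypothetical stand-in for a trained GNN that simply thresholds the sum of a node's own feature and its neighbours' features. The graph, features, and threshold are made up for illustration.

```python
from itertools import combinations


def predict(edges, features, node, threshold=1.2):
    """Toy stand-in for a GNN (assumption): sum the node's own feature
    and its neighbours' features, then threshold to get a class label."""
    total = features[node]
    for u, w in edges:
        if u == node:
            total += features[w]
        elif w == node:
            total += features[u]
    return int(total > threshold)


def minimal_edge_deletion_cf(edges, features, node):
    """Exhaustively search for the smallest set of edges whose removal
    flips the prediction for `node`. Returns None if no such set exists."""
    original = predict(edges, features, node)
    # Try deletion sets in order of increasing size, so the first hit is minimal.
    for k in range(1, len(edges) + 1):
        for removed in combinations(edges, k):
            kept = [e for e in edges if e not in removed]
            if predict(kept, features, node) != original:
                return list(removed)
    return None


# Hypothetical 4-node undirected graph with scalar node features.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
features = {0: 0.5, 1: 1.0, 2: 0.2, 3: 0.9}

original_label = predict(edges, features, node=0)
cf = minimal_edge_deletion_cf(edges, features, node=0)
print("original prediction:", original_label)
print("edges removed in counterfactual:", cf)
```

Exhaustive search is exponential in the number of edges, so it only serves to make the definition concrete on a tiny graph.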

Paper Structure

This paper contains 26 sections, 5 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Intuition of counterfactual example generation by CF-GNNExplainer.
  • Figure 2: Histograms showing the proportion of CF examples of each explanation size for the random baseline. Note that the $x$-axis for ba-shapes goes up to 1500. Left: tree-cycles, Middle: tree-grid, Right: ba-shapes.