ARM-Explainer -- Explaining and improving graph neural network predictions for the maximum clique problem using node features and association rule mining

Bharat Sharman, Elkafi Hassini

TL;DR

The paper addresses explainability for graph neural networks (GNNs) applied to graph-based combinatorial optimization problems (COPs), focusing on the maximum clique problem (MCP). It introduces ARM-Explainer, a post-hoc, model-level explainer that uses FP-Growth association rule mining to derive human-readable rules from GNN predictions and node features. Empirically, augmenting the HGS GNN with additional node features raises the median largest-found clique size on large BHOSLIB-DIMACS graphs by 22% (from 29.5 to 36), and ARM-Explainer uncovers dataset-specific rules with high lift and confidence (medians of 2.42 and 0.49 for the eight most explanatory rules). The work opens avenues for applying ARM-based explanations to other graph COPs and for exploring alternative ARM approaches and self-interpretable COP models.

Abstract

Numerous graph neural network (GNN)-based algorithms have been proposed to solve graph-based combinatorial optimization problems (COPs), but methods to explain their predictions remain largely undeveloped. We introduce ARM-Explainer, a post-hoc, model-level explainer based on association rule mining, and demonstrate it on the predictions of the hybrid geometric scattering (HGS) GNN for the maximum clique problem (MCP), a canonical NP-hard graph-based COP. The eight most explanatory association rules discovered by ARM-Explainer achieve high median lift and confidence values of 2.42 and 0.49, respectively, on test instances from the TWITTER and BHOSLIB-DIMACS benchmark datasets. ARM-Explainer identifies the most important node features, together with their value ranges, that influence the GNN's predictions on these datasets. Furthermore, augmenting the GNN with informative node features substantially improves its performance on the MCP, increasing the median largest-found clique size by 22% (from 29.5 to 36) on large graphs from the BHOSLIB-DIMACS dataset.
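The lift and confidence figures quoted in the abstract are standard association-rule metrics. The following is a minimal sketch of how they are computed for a single rule, assuming each node is treated as one "transaction" of discretized feature items plus a discretized GNN prediction. The item names (`degree=high`, `core=high`, `pred=high`) and the toy data are hypothetical, and the counting is done by brute force rather than with FP-Growth, which the paper uses for mining.

```python
from typing import List, Set, Tuple


def support(transactions: List[Set[str]], items: Set[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[Set[str]], antecedent: Set[str], consequent: Set[str]
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    supp_ante = support(transactions, antecedent)
    supp_cons = support(transactions, consequent)
    if supp_ante == 0 or supp_cons == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    supp_rule = support(transactions, antecedent | consequent)
    confidence = supp_rule / supp_ante   # estimate of P(consequent | antecedent)
    lift = confidence / supp_cons        # > 1 means positive association
    return supp_rule, confidence, lift


# Hypothetical toy data: one set of items per node (not taken from the paper).
nodes = [
    {"degree=high", "core=high", "pred=high"},
    {"degree=high", "core=high", "pred=high"},
    {"degree=high", "core=low", "pred=low"},
    {"degree=low", "core=low", "pred=low"},
    {"degree=low", "core=low", "pred=low"},
    {"degree=low", "core=high", "pred=high"},
    {"degree=low", "core=low", "pred=low"},
    {"degree=low", "core=low", "pred=low"},
]

supp, conf, lift = rule_metrics(nodes, {"degree=high"}, {"pred=high"})
print(f"support={supp:.3f} confidence={conf:.3f} lift={lift:.3f}")
```

A lift above 1 indicates that the antecedent makes the consequent more likely than its base rate, which is why a median lift of 2.42 can count as strong even when the median confidence is a moderate 0.49.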

Paper Structure

This paper contains 20 sections, 5 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of ARM-Explainer. The part highlighted in blue is the ARM-based explanation module. The node features and GNN predictions (node probabilities of belonging to a maximum clique) across all graph instances are used to derive post-hoc, model-level rules that serve as explanations for the GNN predictions.
  • Figure 2: Node count, edge count, and density distribution of 583 training and 196 test graph instances of the TWITTER dataset
  • Figure 3: Node count, edge count, and density distribution of 76 training and 34 test graph instances of the BHOSLIB and DIMACS datasets
  • Figure 4: Correlation between node features of TWITTER graphs (train and test combined).
  • Figure 5: Correlation between node features of BHOSLIB and DIMACS graphs (train and test combined). Since eccentricity has a constant value of 2 for all graph instances in these datasets, its correlations are omitted.
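The figure captions refer to graph density and to eccentricity being constant at 2, which is why its correlations are omitted in Figure 5. The sketch below illustrates both points on a hypothetical 5-node graph (not from the datasets). It assumes density means the usual undirected simple-graph ratio 2|E| / (|V|(|V| - 1)), assumes a connected graph for the eccentricity computation, and uses node degree as a stand-in second feature. The Pearson correlation is undefined whenever one feature has zero variance.

```python
from collections import deque
from math import sqrt


def density(num_nodes: int, num_edges: int) -> float:
    """Density of an undirected simple graph: 2|E| / (|V| (|V| - 1))."""
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def eccentricity(adj: dict, source: int) -> int:
    """Largest shortest-path distance from `source` (graph assumed connected)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def pearson(xs, ys):
    """Pearson correlation, or None when either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # division by zero: the correlation is undefined
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(var_x * var_y)


# Hypothetical graph: a 5-cycle with one chord (0, 2).
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
adj = {i: set() for i in range(5)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

degrees = [len(adj[i]) for i in range(5)]
eccs = [eccentricity(adj, i) for i in range(5)]

print(density(len(adj), len(edges)))   # 0.6
print(eccs)                            # [2, 2, 2, 2, 2]
print(pearson(degrees, eccs))          # None
```

In this toy graph every node has eccentricity 2 while degrees vary, so the degree-eccentricity correlation cannot be computed, mirroring the omission described in the Figure 5 caption.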