Robust Ante-hoc Graph Explainer using Bilevel Optimization
Kha-Dinh Luong, Mert Kosan, Arlei Lopes Da Silva, Ambuj Singh
TL;DR
The paper addresses the need for explainable graph predictions in high-stakes domains by proposing RAGE, a robust ante-hoc graph explainer that learns edge-level explanations jointly with a GNN using bilevel optimization. The method optimizes an edge-influence matrix $Z$ so that the adjacency is effectively $A_Z = Z \odot A$, and employs an inner loop to train the GNN parameters $\theta$ and an outer loop to update the explainer parameters $\Phi$ guiding $Z$. This yields compact, discriminative explanations while maintaining or improving predictive accuracy, demonstrated across eight molecular graph datasets, including four real-world benchmarks and three semi-synthetic ground-truth scenarios. RAGE achieves competitive or superior performance in both graph classification and explainability metrics, and shows enhanced reproducibility of predictor behavior when explanations are used to reproduce decisions. The work highlights the potential of bilevel, ante-hoc explainers to provide robust, reproducible insights for graph-based predictions with practical impact in molecular discovery and related domains.
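As a rough illustration of the inner/outer structure described above, here is a minimal NumPy sketch on two toy graphs. Everything in it is an assumption made for illustration and not the authors' implementation: the explainer is a hypothetical per-feature scoring function with parameters `phi` (standing in for $\Phi$), the predictor is a single message-passing round followed by a logistic layer with weights `theta` (standing in for $\theta$), the outer loss is evaluated on the same toy data as the inner loss, and the outer gradient is approximated by finite differences through the whole inner loop.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edge_influence(phi, X):
    # Hypothetical explainer: z_ij = sigmoid(sum_k phi_k * x_ik * x_jk), symmetric in (i, j).
    return sigmoid((X * phi) @ X.T)

def readout(phi, A, X):
    A_Z = edge_influence(phi, X) * A      # A_Z = Z ⊙ A (element-wise product)
    H = (A_Z + np.eye(len(A))) @ X        # one message-passing round with self-loops
    return H.mean(axis=0)                 # mean-pool nodes into a graph vector

def bce(theta, phi, graphs):
    # Mean binary cross-entropy of a logistic classifier on the pooled vectors.
    losses = []
    for A, X, y in graphs:
        p = np.clip(sigmoid(readout(phi, A, X) @ theta), 1e-9, 1 - 1e-9)
        losses.append(-(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return float(np.mean(losses))

def inner_train(phi, graphs, dim, steps=50, lr=0.2):
    # Inner loop: fit predictor weights theta with the explainer phi held fixed.
    theta = np.zeros(dim)
    for _ in range(steps):
        grad = np.zeros(dim)
        for A, X, y in graphs:
            h = readout(phi, A, X)
            grad += (sigmoid(h @ theta) - y) * h   # gradient of BCE w.r.t. theta
        theta -= lr * grad / len(graphs)
    return theta

def outer_objective(phi, graphs, dim):
    # Loss reached after the inner loop has trained theta for this phi.
    return bce(inner_train(phi, graphs, dim), phi, graphs)

def outer_step(phi, graphs, dim, lr=0.5, eps=1e-4):
    # Outer loop step: central finite differences through the whole inner loop
    # (a stand-in for a proper hypergradient; only viable for tiny phi).
    grad = np.zeros_like(phi)
    for k in range(len(phi)):
        d = np.zeros_like(phi)
        d[k] = eps
        grad[k] = (outer_objective(phi + d, graphs, dim)
                   - outer_objective(phi - d, graphs, dim)) / (2 * eps)
    return phi - lr * grad

# Toy data (made up): a triangle labelled 1 and a path labelled 0, one-hot node features.
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
X1 = np.array([[1, 0], [1, 0], [0, 1]], dtype=float)
A2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X2 = np.array([[0, 1], [0, 1], [1, 0]], dtype=float)
graphs = [(A1, X1, 1.0), (A2, X2, 0.0)]

phi = np.zeros(2)
for _ in range(5):                         # a few outer updates of the explainer
    phi = outer_step(phi, graphs, dim=2)
theta = inner_train(phi, graphs, dim=2)    # final predictor for the learned phi
Z1 = edge_influence(phi, X1)               # edge-influence matrix for the first graph
```

Finite differences are used here only to keep the sketch dependency-free; they need two full inner-loop runs per explainer parameter, so a practical implementation would more likely rely on automatic differentiation through the inner updates.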
Abstract
Explaining the decisions made by machine learning models for high-stakes applications is critical for increasing transparency and guiding improvements to these decisions. This is particularly true in the case of models for graphs, where decisions often depend on complex patterns combining rich structural and attribute data. While recent work has focused on designing so-called post-hoc explainers, the broader question of what constitutes a good explanation remains open. One intuitive property is that explanations should be sufficiently informative to reproduce the predictions given the data. In other words, a good explainer can be repurposed as a predictor. Post-hoc explainers do not achieve this goal as their explanations are highly dependent on fixed model parameters (e.g., learned GNN weights). To address this challenge, we propose RAGE (Robust Ante-hoc Graph Explainer), a novel and flexible ante-hoc explainer designed to discover explanations for graph neural networks using bilevel optimization, with a focus on the chemical domain. RAGE can effectively identify molecular substructures that contain the full information needed for prediction while enabling users to rank these explanations in terms of relevance. Our experiments on various molecular classification tasks show that RAGE explanations are better than those produced by existing post-hoc and ante-hoc approaches.
