Global Human-guided Counterfactual Explanations for Molecular Properties via Reinforcement Learning

Danqing Wang, Antonis Antoniades, Kha-Dinh Luong, Edwin Zhang, Mert Kosan, Jiachen Li, Ambuj Singh, William Yang Wang, Lei Li

TL;DR

RLHEX tackles the challenge of producing global, human-aligned counterfactual explanations for molecular property prediction with graph neural networks. It combines a fragment-based PSVAE graph generator, a latent distribution adapter, and PPO-based policy optimization to produce diverse yet chemically valid explanations that cover many input graphs. Across three real-world molecular datasets, RLHEX achieves higher coverage and lower explanatory distance than strong baselines, with expert chemists noting alignment with domain knowledge. The work advances interpretable AI for chemistry by delivering compact, domain-relevant global explanations and provides code for reproducibility.

Abstract

Counterfactual explanations of Graph Neural Networks (GNNs) offer a powerful way to understand data that can naturally be represented by a graph structure. Furthermore, in many domains, it is highly desirable to derive data-driven global explanations or rules that can better explain the high-level properties of the models and data in question. However, evaluating global counterfactual explanations is hard in real-world datasets due to a lack of human-annotated ground truth, which limits their use in areas like molecular sciences. Additionally, the increasing scale of these datasets poses a challenge for random search-based methods. In this paper, we develop a novel global explanation model, RLHEX, for molecular property prediction. It aligns the counterfactual explanations with human-defined principles, making the explanations more interpretable and easier for experts to evaluate. RLHEX includes a VAE-based graph generator to generate global explanations and an adapter to adjust the latent representation space to human-defined principles. Optimized by Proximal Policy Optimization (PPO), the global explanations produced by RLHEX cover 4.12% more input graphs and reduce the distance between the counterfactual explanation set and the input set by 0.47% on average across three molecular datasets. RLHEX provides a flexible framework to incorporate different human-designed principles into the counterfactual explanation generation process, aligning these explanations with domain expertise. The code and data are released at https://github.com/dqwang122/RLHEX.
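
The two quantities reported in the abstract, the share of input graphs covered and the distance between the explanation set and the input set, can be illustrated with a minimal sketch. It rests on assumptions not stated here: that an input graph counts as covered when its nearest counterfactual lies within a threshold `theta`, and that the set-level distance is the median of those nearest distances. The function name, the precomputed distance matrix, and the toy values are all hypothetical.

```python
from statistics import median

def coverage_and_cost(dist, theta):
    """Summarise how well a counterfactual set explains a set of input graphs.

    dist[i][j] is a precomputed distance between input graph i and
    counterfactual j (hypothetical values; the distance measure itself is
    not specified here).
    """
    # Distance from each input graph to its closest counterfactual.
    nearest = [min(row) for row in dist]
    # Assumed definition: fraction of inputs with a counterfactual within theta.
    coverage = sum(d <= theta for d in nearest) / len(nearest)
    # Assumed definition: median nearest distance over all inputs.
    cost = median(nearest)
    return coverage, cost

# Toy example: 4 input graphs, 2 counterfactual explanations.
dist = [
    [0.05, 0.40],
    [0.30, 0.08],
    [0.50, 0.60],
    [0.09, 0.70],
]
coverage, cost = coverage_and_cost(dist, theta=0.1)
print(f"coverage={coverage:.2f}, cost={cost:.3f}")
```

Under these assumed definitions, a larger explanation set can only keep or raise coverage and keep or lower cost, since each input's nearest distance never grows when a counterfactual is added.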

Paper Structure

This paper contains 25 sections, 13 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: RLHEX has three main parts: the VAE-based generation model, the adapter, and the reward module. The reward module contains several reward functions based on human-designed principles, which make the generated explanations easier for domain experts to interpret. The adapter modifies the latent representation $z$ by adding the delta $z'$, which is optimized using PPO to align with those principles. RLHEX takes the molecule to be explained as input and creates the CF explanations from the modified latent representation $z + z'$.
  • Figure 2: Coverage and cost for different explanation-set sizes $k$. RLHEX generally outperforms the baselines on both coverage and cost. We use $i = 20$ iterations for generation.
  • Figure 3: Coverage on the AIDS dataset for different numbers of iterations $i$. We limit the explanation set to $k=10$ when calculating coverage.
  • Figure 4: Counterfactual (CF) explanations generated for the AIDS and Dipole datasets, shown with their closest input molecules. For each CF explanation, we compute its distance to the input molecules and display the 5 closest. The generated CF explanation for AIDS has a coverage of 0.231 over the input molecule set, while the CF explanation for the Dipole dataset has a coverage of 0.209.
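
The latent shift described in the Figure 1 caption can be sketched as follows. This is a toy illustration only, not the released implementation: the linear form of the adapter, the names `adapter_delta`, `shifted_latent`, and `total_reward`, the reward coefficients, and the two toy reward functions are all assumptions, and the PPO update that would train the adapter is omitted.

```python
def adapter_delta(z, weights, bias):
    """Hypothetical linear adapter: maps a latent vector z to a shift z'."""
    return [sum(w * x for w, x in zip(row, z)) + b
            for row, b in zip(weights, bias)]

def shifted_latent(z, weights, bias):
    """Latent handed to the decoder: z + z'."""
    delta = adapter_delta(z, weights, bias)
    return [x + d for x, d in zip(z, delta)]

def total_reward(candidate, reward_fns, coeffs):
    """Weighted sum of principle-based rewards (coefficients are assumptions)."""
    return sum(c * fn(candidate) for fn, c in zip(reward_fns, coeffs))

# Latent vector of the molecule to be explained (toy values).
z = [0.5, -1.0, 2.0]

# An adapter with all-zero parameters leaves the latent unchanged.
zero_w = [[0.0] * 3 for _ in range(3)]
unchanged = shifted_latent(z, zero_w, [0.0] * 3)

# A bias-only adapter shifts each latent coordinate by a constant.
z_shifted = shifted_latent(z, zero_w, [0.1, 0.0, -0.2])

# Toy stand-ins for rewards computed on a decoded candidate.
candidate = {"valid": True, "flips_prediction": True}
reward_fns = [
    lambda m: 1.0 if m["valid"] else 0.0,
    lambda m: 1.0 if m["flips_prediction"] else 0.0,
]
reward = total_reward(candidate, reward_fns, coeffs=[0.5, 1.0])
print(z_shifted, reward)
```

In a trained system the scalar reward would drive the policy update of the adapter parameters, while the generator's encoder and decoder supply $z$ and turn $z + z'$ back into a molecule.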