GraphPINE: Graph Importance Propagation for Interpretable Drug Response Prediction

Yoshitaka Inoue, Tianfan Fu, Augustin Luna

TL;DR

GraphPINE introduces an interpretable graph neural network (GNN) for drug response prediction (DRP). It initializes node importance with domain-specific prior knowledge from drug–gene interactions and propagates this importance through a novel importance propagation (IP) layer. The method unifies feature updates and graph-based importance propagation, enhancing both predictive performance and interpretability. On a large NCI-60–derived DRP dataset, GraphPINE (GT variant) achieves a PR-AUC of 0.894 and a ROC-AUC of 0.796. Ablations show that the IP layer substantially improves performance across architectures, and interpretability analyses reveal biologically meaningful gene–drug associations. The approach offers a principled framework for incorporating priors into GNNs, improving explainability without sacrificing accuracy, and holds potential for wider applications in network biology and precision medicine.

Abstract

Explainability is necessary for many tasks in biomedical research. Recent explainability methods have focused on attention, gradients, and Shapley values. These methods do not handle data with strong associated prior knowledge and fail to constrain explainability results based on known relationships between predictive features. We propose GraphPINE, a graph neural network (GNN) architecture for drug response prediction that leverages domain-specific prior knowledge to initialize node importance, which is then optimized during training. Typically, a manual post-prediction step examines the literature (i.e., prior knowledge) to understand the returned predictive features. While node importance can be obtained from gradient- and attention-based methods after prediction, it lacks complementary prior knowledge; GraphPINE seeks to overcome this limitation. GraphPINE differs from other GNN gating methods by utilizing an LSTM-like sequential format. We introduce an importance propagation layer that unifies 1) updates to the feature matrix and node importance and 2) GNN-based graph propagation of feature values. This initialization and updating mechanism allows for informed feature learning and improved graph representation. We apply GraphPINE to cancer drug response prediction using drug screening and gene data collected for over 5,000 gene nodes included in a gene-gene graph, with a drug-target interaction (DTI) graph providing the initial importance. The gene-gene graph and DTIs were obtained from curated sources and weighted by the number of articles discussing relationships between drugs and genes. GraphPINE achieves a PR-AUC of 0.894 and a ROC-AUC of 0.796 across 952 drugs. Code is available at https://anonymous.4open.science/r/GraphPINE-40DE.
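
The abstract describes a layer that updates node features and node importance together while propagating values over the gene-gene graph. The sketch below is only a toy illustration of that idea, not the authors' implementation: it uses scalar node features, a neighbor mean as a stand-in for the GNN, a sigmoid gate on importance, and a blend weight `alpha`. The functions `neighbor_mean` and `ip_layer`, the exact update rules, and the example values are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def neighbor_mean(values, adjacency):
    """Average each node's value with the values of its graph neighbors."""
    out = []
    for i, row in enumerate(adjacency):
        idx = [i] + [j for j, a in enumerate(row) if a and j != i]
        out.append(sum(values[j] for j in idx) / len(idx))
    return out


def ip_layer(features, importance, adjacency, alpha=0.5):
    """One hypothetical importance-propagation step (assumed update rules)."""
    # 1) Graph propagation of feature values (stand-in for the GNN).
    messages = neighbor_mean(features, adjacency)
    # 2) Importance gating: a gate in (0, 1) derived from each node's importance.
    gates = [sigmoid(s) for s in importance]
    # 3) Feature update with a residual connection.
    new_features = [h + g * m for h, g, m in zip(features, gates, messages)]
    # 4) Propagate importance over the same graph, then blend it with the
    #    previous importance; alpha controls how much of the prior is kept.
    propagated = neighbor_mean(importance, adjacency)
    new_importance = [
        alpha * s + (1 - alpha) * p for s, p in zip(importance, propagated)
    ]
    return new_features, new_importance


# Toy path graph 0 - 1 - 2; node 0 plays the role of a known drug target.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [0.2, -0.1, 0.4]
importance = [1.0, 0.0, 0.0]  # prior importance, e.g. from a DTI score

new_features, new_importance = ip_layer(features, importance, adjacency)
print(new_importance)
```

In this toy run the prior importance of node 0 leaks to its neighbor (node 1) after one step, while node 2, two hops away, is still at zero. Stacking several such steps would spread importance further across the graph.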

Paper Structure

This paper contains 40 sections, 18 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overview of GraphPINE Components. (A) Importance Propagation (IP) Layer: This illustrates the key components of the IP Layer in the GraphPINE model, including the GNN, importance gating, feature updates with residual connections, importance propagation, and updates. The symbols represent the following operations: $\sigma$ is the activation function, $\odot$ is element-wise multiplication, $\times$ is multiplication, $+$ is addition, $W$ denotes weighted calculation with bias, $||$ represents concatenation, and $\alpha$ is a hyperparameter for controlling importance. (B) GraphPINE architecture. (C) Data Creation Overview: The model integrates multi-omics data (gene expression, copy number, methylation, mutation) from NCI60 [shoemaker2006nci60] with gene-gene interaction networks from PathwayCommons [cerami2010pathwaypc2019]. Each edge has attributes such as "interact-with", which are converted into one-hot vectors used as edge attributes.
  • Figure 2: Gene importance scores for 9-Methoxycamptothecin. Node size describes the propagated gene importance, and node color shows the initial DTI score.
  • Figure 3: Differences in Node (Gene) Ranks Before and After Propagation. Cosine sim.: Cosine similarity between the initial and propagated importance ranks. Spearman corr.: Spearman rank correlation between the initial and propagated importance ranks. Rank changes: The percentage of genes whose ranks changed after propagation. Avg. shift: The average rank shift. Max up/down: Maximum upward/downward rank mobility.
  • Figure 3: Gene importance scores and interactions for Roscovitine derivative 1. Node size describes the propagated gene importance.
  • Figure 4: Distribution of Interaction Numbers Before/After Propagation. Initial interactions (blue) show a distribution concentrated near zero interactions, while propagated interactions (orange) show a broader distribution centered around 2000 interactions.
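
The rank-comparison statistics named in the Figure 3 caption above (cosine similarity, Spearman correlation, percentage of rank changes, average shift, maximum up/down) can be computed along the following lines. This is a minimal sketch under stated assumptions, not the paper's evaluation code: the helper names (`ranks`, `pearson`, `rank_shift_summary`), the tie-breaking rule, the use of the mean absolute shift for "Avg. shift", and the toy scores are all hypothetical.

```python
import math


def ranks(scores):
    """Rank 1 = highest score; ties are broken by index (an assumption)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    result = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def pearson(x, y):
    """Pearson correlation; applied to ranks, this is Spearman's rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_shift_summary(initial, propagated):
    """Compare gene ranks before and after importance propagation."""
    r0, r1 = ranks(initial), ranks(propagated)
    dot = sum(a * b for a, b in zip(r0, r1))
    norm0 = math.sqrt(sum(a * a for a in r0))
    norm1 = math.sqrt(sum(b * b for b in r1))
    # Positive shift: the gene moved up (toward rank 1) after propagation.
    shifts = [a - b for a, b in zip(r0, r1)]
    return {
        "cosine_sim": dot / (norm0 * norm1),
        "spearman_corr": pearson(r0, r1),
        "rank_changes_pct": 100.0 * sum(s != 0 for s in shifts) / len(shifts),
        "avg_shift": sum(abs(s) for s in shifts) / len(shifts),
        "max_up": max(shifts),
        "max_down": -min(shifts),
    }


# Toy scores for four genes (made-up values, not data from the paper).
initial = [0.9, 0.0, 0.1, 0.0]
propagated = [0.6, 0.5, 0.2, 0.1]
summary = rank_shift_summary(initial, propagated)
print(summary)
```

Here two of the four toy genes swap places, so half of the ranks change, and each changed gene moves by exactly one position.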