Disentangling Causal Substructures for Interpretable and Generalizable Drug Synergy Prediction

Yi Luo, Haochen Zhao, Xiao Liang, Yiwei Liu, Yuye Zhang, Xinyu Li, Jianxin Wang

TL;DR

CausalDDS is a novel framework that disentangles drug molecules into causal and spurious substructures, utilizing the causal substructure representations for predicting drug synergy, and outperforms baseline models, particularly in cold start and out-of-distribution settings.

Abstract

Drug synergy prediction is a critical task in the development of effective combination therapies for complex diseases, including cancer. Although existing methods have shown promising results, they often operate as black-box predictors that rely predominantly on statistical correlations between drug characteristics and outcomes. To address this limitation, we propose CausalDDS, a novel framework that disentangles drug molecules into causal and spurious substructures and uses the causal substructure representations to predict drug synergy. By focusing on causal substructures, CausalDDS mitigates the impact of redundant features introduced by spurious substructures, enhancing both the accuracy and the interpretability of the model. In addition, CausalDDS employs a conditional intervention mechanism, in which interventions are conditioned on paired molecular structures, and introduces a novel optimization objective guided by the principles of sufficiency and independence. Extensive experiments demonstrate that our method outperforms baseline models, particularly in cold start and out-of-distribution settings. Moreover, CausalDDS identifies key substructures underlying drug synergy, providing clear insights into how drug combinations work at the molecular level. These results underscore the potential of CausalDDS as a practical tool for predicting drug synergy and facilitating drug discovery.
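As a rough illustration of what disentangling a molecule into causal and spurious parts can look like, the sketch below follows the description in the Figure 1 caption: an atom-level importance score is estimated from the embedding matrix and then used to mask the representation. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scorer is a single linear layer with a sigmoid standing in for the MLP, the mask is soft (score and one minus score), and the names `atom_importance`, `disentangle`, and `mean_pool`, as well as the toy numbers, are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def atom_importance(atom_embeddings, weights, bias):
    """Score each atom in (0, 1) with one linear layer plus a sigmoid.

    Hypothetical stand-in for the MLP scorer; weights and bias are made up.
    """
    return [
        sigmoid(sum(w * h for w, h in zip(weights, row)) + bias)
        for row in atom_embeddings
    ]


def disentangle(atom_embeddings, scores):
    """Soft-mask the embeddings (an assumption, not the paper's exact rule).

    score * h is treated as the causal part, (1 - score) * h as the spurious part.
    """
    causal = [[s * h for h in row] for row, s in zip(atom_embeddings, scores)]
    spurious = [[(1.0 - s) * h for h in row] for row, s in zip(atom_embeddings, scores)]
    return causal, spurious


def mean_pool(rows):
    """Average the atom rows into one molecule-level vector."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


# Toy molecule: 3 atoms with 2-dimensional embeddings (made-up numbers).
H = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]
scores = atom_importance(H, weights=[2.0, -1.0], bias=0.0)
causal, spurious = disentangle(H, scores)
causal_vec = mean_pool(causal)  # would feed the downstream synergy predictor
```

With a soft mask of this form, the causal and spurious parts always sum back to the original embedding, so no information is dropped at the split; only the causal part is passed on to the predictor.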

Paper Structure

This paper contains 25 sections, 14 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Overview of the CausalDDS framework. The inputs to the model are the SMILES strings of the two drugs and the gene expression data of the cell line. The output is a predicted drug synergy score or binary label. First, the drug molecules are encoded using Graph Isomorphism Networks (GINs), where each row in the resulting representation corresponds to an aggregated embedding of local atomic and bond-level substructures. Second, the encoded representations of the two drugs are fed into a molecular interaction module to capture pairwise atom-level interactions and derive interaction-aware molecular representations. Third, the causal disentangle module uses an MLP to estimate an atom-level importance score from the embedding matrix. Based on the importance score, the causal substructure and the spurious substructure are separated by masking the molecular representation. Finally, the causal substructure representations of the selected drug pair, together with the gene expression data of the target cell line, are jointly input into a fully connected neural network to predict drug synergy. In the conditional causal intervention module, given a drug pair, spurious substructures are extracted by modeling their interactions with other drugs in the training set and are then used to optimize the model through conditional intervention. The optimization objective is defined according to the principles of sufficiency and independence. The final loss function is defined as the sum of three losses.
  • Figure 2: Performance evaluation with scarce data on the DrugCombDB dataset
  • Figure 3: Performance evaluation on the AstraZeneca dataset using models trained on the DrugCombDB dataset
  • Figure 4: Interpretability analysis of CausalDDS for explaining drug synergy mechanisms. a, Visualization of the causal substructures underlying the synergistic drug combination of sunitinib and erlotinib in the COLO320 cell line. The indolinone substructure of sunitinib and the quinazoline substructure of erlotinib, which play decisive roles, are highlighted with green dashed boxes. b, Visualization of the causal substructures underlying the synergistic drug combination of raloxifene and mercaptopurine in the SK-MEL-28 cell line. The phenolic hydroxyl group and diaryl benzothiophene rigid scaffold of raloxifene and the N-9 glycosylation site of mercaptopurine are highlighted with green dashed boxes.
  • Figure 5: Case study of structural optimization for drug synergy with CausalDDS. a, The left side shows the 2D structure of lapatinib, and the right side shows the 2D structure of afatinib. The yellow-highlighted part is the maximum common structure of the two drugs found by RDKit. The score is the Tanimoto similarity. b, The left side shows the 2D structure of erlotinib, and the right side shows the 2D structure of gefitinib.
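
The Figure 5 caption reports a Tanimoto similarity between drug pairs. For readers unfamiliar with the measure, here is a minimal sketch of how it is computed on binary fingerprints: the number of bits set in both fingerprints divided by the number of bits set in either. The caption does not state which fingerprint type was used, so the bit sets below are made-up placeholders rather than fingerprints of the actual drugs, and the function name `tanimoto` is hypothetical. In practice a cheminformatics toolkit such as RDKit would generate the fingerprints from the molecular structures.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Placeholder fingerprints (hypothetical bit indices, not real molecules).
fp_x = {1, 4, 7, 9, 12}
fp_y = {1, 4, 8, 9, 15}

# 3 shared bits out of 7 distinct bits overall, so the score is 3/7.
score = tanimoto(fp_x, fp_y)
```

The score ranges from 0 (no shared bits) to 1 (identical fingerprints), which is why it is a convenient single-number summary of how structurally close two drugs are.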