
A graph neural network-based model with Out-of-Distribution Robustness for enhancing Antiretroviral Therapy Outcome Prediction for HIV-1

Giulia Di Teodoro, Federico Siciliano, Valerio Guarrasi, Anne-Mieke Vandamme, Valeria Ghisetti, Anders Sönnerborg, Maurizio Zazzi, Fabrizio Silvestri, Laura Palagi

TL;DR

This work tackles HIV-1 ART outcome prediction for therapies that include underrepresented or newly introduced drugs. It proposes MIX, a joint fusion of a fully connected (FC) network and a graph neural network (GNN) that combines tabular genotype features with a mutation–drug knowledge graph whose edges carry Stanford scores $s_{m_jd_k}$. The model is trained in two phases (pretraining of the FC and GNN components, then end-to-end fusion) and evaluated in both OoD and non-OoD settings, with input dimensionality $N_M+N_D=5970$. MIX generally outperforms a standalone FC model, particularly when the test data include OoD drugs, while also delivering substantial gains in standard scenarios, which points to stronger generalization and robustness for clinical decision support. The approach highlights the broader potential of knowledge-graph-driven multimodal modeling in infectious disease treatment prediction and invites future work on hyper-graphs and on incorporating older mutations to further enhance generalization.

Abstract

Predicting the outcome of antiretroviral therapies (ART) for HIV-1 is a pressing clinical challenge, especially when the ART includes drugs with limited effectiveness data. This scarcity of data can arise either from the introduction of a new drug to the market or from limited use in clinical settings, resulting in clinical datasets with highly unbalanced therapy representation. To tackle this issue, we introduce a novel joint fusion model, which combines features from a Fully Connected (FC) Neural Network and a Graph Neural Network (GNN) in a multi-modality fashion. Our model uses both tabular data about genetic sequences and a knowledge base derived from Stanford drug-resistance mutation tables, which serve as benchmark references for deducing in-vivo treatment efficacy based on the viral genetic sequence. By leveraging this knowledge base structured as a graph, the GNN component enables our model to adapt to imbalanced data distributions and account for Out-of-Distribution (OoD) drugs. We evaluated the robustness of the proposed model and of the FC model against OoD drugs in the test set. Our comprehensive analysis demonstrates that the proposed model consistently outperforms the FC model. These results underscore the advantage of integrating Stanford scores in the model, which not only enhances its generalizability and robustness but also extends its utility in supporting more informed clinical decisions when data availability is limited. The source code is available at https://github.com/federicosiciliano/graph-ood-hiv
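
To make the joint fusion idea in the abstract more concrete, the following is a minimal forward-pass sketch in plain Python. It is not the authors' implementation (see the linked repository for that). The toy sizes, the random weights, the single round of score-weighted aggregation, and all function names (`multi_hot`, `fc_branch`, `gnn_branch`, `mix_forward`) are assumptions made purely for illustration, and the matrix `S` only stands in for the Stanford scores with made-up values.

```python
import math
import random

random.seed(0)

# Toy sizes (hypothetical); the real input is far larger.
N_M, N_D, HIDDEN = 6, 3, 4


def multi_hot(indices, size):
    """Binary list with ones at the given indices."""
    return [1.0 if i in indices else 0.0 for i in range(size)]


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def relu(v):
    return [max(x, 0.0) for x in v]


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


# S[j][k] stands in for the score of mutation j against drug k (made-up values).
S = rand_matrix(N_M, N_D)
W_fc = rand_matrix(N_M + N_D, HIDDEN)   # FC branch weights (untrained)
W_gnn = rand_matrix(1, HIDDEN)          # drug-node update weights (untrained)
w_out = rand_matrix(1, 2 * HIDDEN)[0]   # fusion head weights (untrained)


def fc_branch(x_mut, x_drug):
    # Tabular branch: concatenated multi-hot mutation and drug indicators.
    return relu(vec_mat(x_mut + x_drug, W_fc))


def gnn_branch(x_mut, x_drug):
    # Each drug node sums the scores of the mutations present in the sequence,
    # is transformed, and the drugs in the therapy are mean-pooled.
    msgs = vec_mat(x_mut, S)
    drug_h = [relu(vec_mat([m], W_gnn)) for m in msgs]
    n = max(sum(x_drug), 1.0)
    return [sum(x_drug[k] * drug_h[k][h] for k in range(N_D)) / n
            for h in range(HIDDEN)]


def mix_forward(x_mut, x_drug):
    # Joint fusion: concatenate both feature vectors, then one sigmoid output.
    fused = fc_branch(x_mut, x_drug) + gnn_branch(x_mut, x_drug)
    logit = sum(f * w for f, w in zip(fused, w_out))
    return 1.0 / (1.0 + math.exp(-logit))


x_mut = multi_hot({0, 2, 5}, N_M)    # mutations observed in the sequence
x_drug = multi_hot({1, 2}, N_D)      # drugs in the therapy
p = mix_forward(x_mut, x_drug)
print(f"predicted outcome probability (untrained toy weights): {p:.3f}")
```

In this sketch the graph branch reaches a drug only through its column of `S`, so a drug with no training examples still receives a representation from its scores. That is one plausible reading of why a knowledge graph could help with OoD drugs. The two-phase training (pretraining, then end-to-end fusion) is not shown here.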

Paper Structure (17 sections, 1 equation, 2 figures, 3 tables)


Figures (2)

  • Figure 1: Schematic view of the pipeline.
  • Figure 2: Panels (a), (b) and (c) show the models' test performance in terms of Accuracy, ROC AUC score and PREC-REC AUC score, respectively. The histograms are grouped by OoD feature.