A graph neural network-based model with Out-of-Distribution Robustness for enhancing Antiretroviral Therapy Outcome Prediction for HIV-1

Giulia Di Teodoro; Federico Siciliano; Valerio Guarrasi; Anne-Mieke Vandamme; Valeria Ghisetti; Anders Sönnerborg; Maurizio Zazzi; Fabrizio Silvestri; Laura Palagi

TL;DR

This work addresses outcome prediction for HIV-1 antiretroviral therapy (ART) when some drugs are underrepresented or newly introduced. It proposes MIX, a joint fusion of a Fully Connected (FC) network and a Graph Neural Network (GNN) that combines tabular genotype features with a mutation–drug knowledge graph whose edges carry Stanford scores $s_{m_jd_k}$. The model is trained in two phases (pretraining of the FC and GNN components, then end-to-end fusion) and evaluated in both Out-of-Distribution (OoD) and non-OoD settings, with input dimensionality $N_M+N_D=5970$. The MIX model generally outperforms a standalone FC model, particularly when the test data include OoD drugs, while also delivering substantial gains in standard scenarios, which indicates stronger generalization and robustness for clinical decision support. The approach highlights the broader potential of knowledge-graph-driven multimodal modeling in infectious disease treatment prediction and suggests future work on hyper-graphs and on incorporating older mutations to further improve generalization.

Abstract

Predicting the outcome of antiretroviral therapies (ART) for HIV-1 is a pressing clinical challenge, especially when the ART includes drugs with limited effectiveness data. This scarcity of data can arise either because a new drug has just been introduced to the market or because a drug has seen limited use in clinical settings, which results in clinical datasets with highly unbalanced therapy representation. To tackle this issue, we introduce a novel joint fusion model that combines features from a Fully Connected (FC) Neural Network and a Graph Neural Network (GNN) in a multimodal fashion. Our model uses both tabular data about genetic sequences and a knowledge base derived from the Stanford drug-resistance mutation tables, which serve as benchmark references for deducing in-vivo treatment efficacy from the viral genetic sequence. By leveraging this knowledge base structured as a graph, the GNN component enables our model to adapt to imbalanced data distributions and to account for Out-of-Distribution (OoD) drugs. We evaluated the robustness of both models against OoD drugs in the test set. Our comprehensive analysis demonstrates that the proposed model consistently outperforms the FC model. These results underscore the advantage of integrating Stanford scores in the model: doing so enhances its generalizability and robustness and extends its utility in supporting more informed clinical decisions when data availability is limited. The source code is available at https://github.com/federicosiciliano/graph-ood-hiv
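
To make the two input modalities described above more concrete, the following minimal sketch builds both views of a single therapy sample: a tabular vector and a list of score-weighted mutation-drug edges. It is an illustration under stated assumptions, not the authors' implementation: the binary indicator encoding, the function names `tabular_features` and `graph_edges`, the toy vocabularies, and all score values are hypothetical placeholders and are not taken from the Stanford tables or from the paper's code.

```python
# Hypothetical toy vocabularies (placeholders, not real mutation or drug names).
MUTATIONS = ["m1", "m2", "m3"]
DRUGS = ["d1", "d2", "d3"]

# Made-up mutation-drug scores standing in for the knowledge base.
SCORES = {
    ("m1", "d1"): 30,
    ("m1", "d2"): 15,
    ("m3", "d2"): 60,
    ("m2", "d3"): 10,
}


def tabular_features(mutations, drugs):
    """Return a vector of length len(MUTATIONS) + len(DRUGS).

    Assumption: each entry is a 0/1 indicator, mutation indicators first,
    then drug indicators.
    """
    present_m, present_d = set(mutations), set(drugs)
    return ([int(m in present_m) for m in MUTATIONS]
            + [int(d in present_d) for d in DRUGS])


def graph_edges(mutations, drugs):
    """Return (mutation, drug, score) edges for pairs present in the sample.

    Pairs without an entry in SCORES, or with a zero score, produce no edge.
    """
    return [(m, d, SCORES[(m, d)])
            for m in mutations
            for d in drugs
            if SCORES.get((m, d), 0) != 0]


sample_mutations = ["m1", "m3"]
sample_drugs = ["d2"]
print(tabular_features(sample_mutations, sample_drugs))
print(graph_edges(sample_mutations, sample_drugs))
```

In this sketch, a drug that never appears in the training samples still receives edges as long as the knowledge base has scores for it, which is one plausible reading of how a graph view can help with OoD drugs.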

Paper Structure (17 sections, 1 equation, 2 figures, 3 tables)

This paper contains 17 sections, 1 equation, 2 figures, 3 tables.

Introduction
Dataset
Methodology
Setting
Tabular data
Graph data
Datasets to simulate OoD features
Models
Fully Connected Neural Networks
Graph Neural Networks
MIX - Joint Fusion
Training
Experiments
Results
Conclusions
...and 2 more sections

Figures (2)

Figure 1: Schematic view of the pipeline.
Figure 2: Panels (a), (b) and (c) show the models' test performance in terms of Accuracy, ROC AUC score and PREC-REC AUC score, respectively. The histograms are grouped by OoD features.
