Integrating Single-Cell Foundation Models with Graph Neural Networks for Drug Response Prediction

Till Rossner, Ziteng Li, Jonas Balke, Nikoo Salehfard, Tom Seifert, Ming Tang

TL;DR

Predicting cancer drug response remains challenging due to tumor heterogeneity and limited data. The authors adapt the DeepCDR framework by using scGPT-generated cell embeddings as a foundation-model-driven input, concatenated with graph-based drug representations to predict IC$_{50}$. Across Pearson correlation coefficient (PCC) metrics and leave-one-drug-out tests, the scGPT-enhanced model outperforms both the original DeepCDR and a scFoundation-based baseline and exhibits greater training stability. This work demonstrates the value of single-cell foundation models in cancer drug response (CDR) prediction and points to future gains from expanding foundation-model use to drug representations and additional omics data.

Abstract

AI-driven drug response prediction holds great promise for advancing personalized cancer treatment. However, the inherent heterogeneity of cancer and the high cost of data generation make accurate prediction challenging. In this study, we investigate whether incorporating the pretrained foundation model scGPT can enhance the performance of existing drug response prediction frameworks. Our approach builds on the DeepCDR framework, which encodes drug representations from graph structures and cell representations from multi-omics profiles. We adapt this framework by leveraging scGPT to generate enriched cell representations, using its pretrained knowledge to compensate for the limited amount of data. We evaluate our modified framework on IC$_{50}$ values using the Pearson correlation coefficient (PCC) and a leave-one-drug-out validation strategy, comparing it against the original DeepCDR framework and a prior scFoundation-based approach. The scGPT-based model not only outperforms previous approaches but also exhibits greater training stability, highlighting the value of leveraging scGPT-derived knowledge in this domain.
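The evaluation protocol named in the abstract (PCC plus leave-one-drug-out validation) can be illustrated with a minimal, dependency-free sketch. The record layout, the drug and cell-line names, and the IC$_{50}$ values below are hypothetical placeholders, not data from the paper, and the helper names `pearson` and `leave_one_drug_out` are chosen here for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    # PCC is undefined when either sequence is constant.
    return cov / denom if denom > 0 else float("nan")


def leave_one_drug_out(records):
    """Yield (held_out_drug, train, test) so that the test drug never appears in train."""
    drugs = sorted({r["drug"] for r in records})
    for held_out in drugs:
        train = [r for r in records if r["drug"] != held_out]
        test = [r for r in records if r["drug"] == held_out]
        yield held_out, train, test


# Hypothetical toy records: one row per (drug, cell line) pair with a made-up response value.
records = [
    {"drug": "drugA", "cell_line": "cell1", "ic50": 1.2},
    {"drug": "drugA", "cell_line": "cell2", "ic50": 0.7},
    {"drug": "drugB", "cell_line": "cell1", "ic50": 2.9},
    {"drug": "drugB", "cell_line": "cell2", "ic50": 3.4},
    {"drug": "drugC", "cell_line": "cell1", "ic50": -0.3},
    {"drug": "drugC", "cell_line": "cell2", "ic50": 0.1},
]

for held_out, train, test in leave_one_drug_out(records):
    print(held_out, len(train), len(test))

# PCC between made-up observed and predicted responses.
observed = [r["ic50"] for r in records]
predicted = [1.0, 0.9, 2.5, 3.0, 0.0, 0.4]
print(round(pearson(observed, predicted), 3))
```

In a real run, a model would be retrained on each `train` split and PCC computed on the predictions for the held-out drug, so the score reflects generalization to unseen compounds rather than to unseen drug and cell-line pairs.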

Paper Structure

This paper contains 13 sections, 5 figures.

Figures (5)

  • Figure 1: Overview of our model framework. Drug structures are encoded as molecular graphs and processed by a graph neural network. Bulk cell expression data are embedded using a foundation model. The resulting drug and cell representations are concatenated and used to predict drug response (IC$_{50}$) via a neural network. To mitigate overfitting, we incorporate dropout and batch normalization. For classification, a sigmoid function is applied after the final layer.
  • Figure 2: Comparison of scGPT, scFoundation, and baseline results: PCC values between predicted and actual responses for (a) all cell lines, (b) all cancer types, and (c) all drugs in the test set. Each dot represents a distinct cell line, cancer type, or drug. The red dashed identity line (y = x) serves as a reference for assessing relative model performance.
  • Figure 3: Relationship between predicted and observed IC$_{50}$ values for the best prediction case among cancer types, Low-Grade Gliomas (a), and drug types, Tubastatin A (b). Each dot represents a combination of a drug and a cell line, with the dashed red line indicating a perfect correlation between predicted and observed IC$_{50}$.
  • Figure 4: Comparison of model performance in the leave-one-drug-out test: difference in PCC between each of two models (scGPT and scFoundation) and the baseline. Each dot represents one of the 20 randomly selected drugs from the dataset, with the y-axis indicating the PCC gain and the x-axis representing the rank of drugs by improvement.
  • Figure 5: Comparison of PCC values per epoch for the scGPT-based model, the scFoundation-based model, and the baseline over 20 training epochs (the baseline stopped early after 17 epochs).
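The data flow described in the Figure 1 caption (molecular graph through a graph neural network, concatenation with a cell embedding, then a neural network that outputs a response value) can be sketched in plain Python. This is only an illustration under stated assumptions: a single mean-aggregation graph convolution stands in for the drug encoder, a random vector stands in for the scGPT cell embedding, all weights are random and untrained, and dropout and batch normalization are omitted. None of the dimensions or names come from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    """Element-wise ReLU on a matrix given as lists of rows."""
    return [[max(0.0, v) for v in row] for row in m]


def graph_conv(adj, feats, weight):
    """One graph convolution: average each atom with its neighbours, then project."""
    n = len(adj)
    norm = []
    for i in range(n):
        # Add a self-loop so every atom keeps its own features (degree is always >= 1).
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        degree = sum(row)
        norm.append([v / degree for v in row])
    return relu(matmul(matmul(norm, feats), weight))


def random_matrix(rows, cols, rng):
    """Hypothetical, untrained weights drawn uniformly from [-0.5, 0.5]."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def predict_response(adj, atom_feats, cell_embedding, params):
    """Drug graph + cell embedding -> single predicted response value."""
    atom_hidden = graph_conv(adj, atom_feats, params["w_gnn"])
    # Mean-pool atom representations into one drug vector.
    drug_vec = [sum(col) / len(atom_hidden) for col in zip(*atom_hidden)]
    # Concatenate drug and cell representations, as in the Figure 1 caption.
    joint = drug_vec + list(cell_embedding)
    hidden = relu(matmul([joint], params["w_hidden"]))
    return matmul(hidden, params["w_out"])[0][0]


rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # toy 3-atom chain, not a real drug
atom_feats = random_matrix(3, 4, rng)  # 4 made-up features per atom
cell_embedding = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # placeholder for a foundation-model embedding
params = {
    "w_gnn": random_matrix(4, 6, rng),
    "w_hidden": random_matrix(6 + 8, 5, rng),
    "w_out": random_matrix(5, 1, rng),
}

pred = predict_response(adj, atom_feats, cell_embedding, params)
print(pred)
```

For the classification variant mentioned in the caption, a sigmoid would be applied to the final output. A practical implementation would use a deep-learning library with trained weights rather than hand-written matrix products.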