WISER: Weak supervISion and supErvised Representation learning to improve drug response prediction in cancer

Kumar Shubham, Aishwarya Jayagopal, Syed Mohammed Danish, Prathosh AP, Vaibhav Rajan

TL;DR

WISER tackles the mismatch between preclinical cell-line data and patient drug responses by learning a domain-invariant representation that integrates drug-response information via discrete drug embeddings. It introduces weak supervision and a novel subset selection strategy to leverage abundant unlabeled patient data alongside limited labeled cell-line data. The approach demonstrates superior drug-response prediction on real patient data across several anti-cancer drugs, with notable AUROC and AUPRC gains and interpretability via gene-level validation. By linking validated gene signatures to predicted responses, WISER provides a principled and potentially clinically actionable framework for personalized cancer therapy.

Abstract

Cancer, a leading cause of death globally, occurs due to genomic changes and manifests heterogeneously across patients. To advance research on personalized treatment strategies, the effectiveness of various drugs on cells derived from cancers ('cell lines') is experimentally determined in laboratory settings. Nevertheless, variations in the distribution of genomic data and drug responses between cell lines and humans arise due to biological and environmental differences. Moreover, while genomic profiles of many cancer patients are readily available, the scarcity of corresponding drug response data limits the ability to train machine learning models that can predict drug response in patients effectively. Recent cancer drug response prediction methods have largely followed the paradigm of unsupervised domain-invariant representation learning followed by a downstream drug response classification step. Introducing supervision in both stages is challenging due to heterogeneous patient response to drugs and limited drug response data. This paper addresses these challenges through a novel representation learning method in the first phase and weak supervision in the second. Experimental results on real patient data demonstrate the efficacy of our method (WISER) over state-of-the-art alternatives on predicting personalized drug response.

Paper Structure (31 sections, 8 equations, 2 figures, 15 tables, 1 algorithm)

Figures (2)

  • Figure 1: This diagram outlines WISER's training process, divided into four key phases. First, in the Representation Learning phase, a domain-invariant representation ($\mathcal{Z}$) is learned between cell line and patient genomic profiles using a shared encoder and private encoding scheme. Next, in the Weak Supervision phase, multiple label functions are trained using labeled genomic profiles of cell lines to assign pseudo labels to unlabeled patient genomic profiles. Following that, in the Subset Selection phase, pseudo labels and the domain-invariant representation ($\mathcal{Z}$) are used to select a subset of patient genomic profiles ($\mathcal{D}_{\text{patient}}^{\text{sub}}$) and associated pseudo labels based on the consistency of the labels among nearest neighbors. Finally, in the Drug Response Prediction phase, the selected subset, along with labeled genomic profiles from cell lines, is used to train the downstream classifier and predict drug responses among patients.
  • Figure 2: Ablation of weak supervision and sensitivity of model performance to subset size.
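
The Weak Supervision and Subset Selection phases described in the Figure 1 caption can be illustrated with a small sketch. It is not the authors' implementation: the majority-vote aggregation of label functions, the Euclidean distance, the neighborhood size `k`, the `min_agreement` threshold, and all function and variable names are assumptions made for illustration. The idea shown is only the one stated in the caption: keep a patient profile when its pseudo label is consistent with the pseudo labels of its nearest neighbors in the representation space $\mathcal{Z}$.

```python
import math
from typing import List, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Aggregate binary label-function votes (1 = responder, 0 = non-responder).

    Majority voting is an assumed aggregation rule; ties resolve to 1.
    """
    return int(2 * sum(votes) >= len(votes))


def select_consistent_subset(
    z: Sequence[Sequence[float]],
    pseudo_labels: Sequence[int],
    k: int = 3,
    min_agreement: float = 0.6,
) -> List[int]:
    """Return indices of samples whose pseudo label agrees with at least
    `min_agreement` of their k nearest neighbors (Euclidean distance in z)."""
    selected = []
    for i, zi in enumerate(z):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted((math.dist(zi, zj), j) for j, zj in enumerate(z) if j != i)
        neighbors = [j for _, j in dists[:k]]
        agree = sum(pseudo_labels[j] == pseudo_labels[i] for j in neighbors)
        if neighbors and agree / len(neighbors) >= min_agreement:
            selected.append(i)
    return selected


# Toy stand-in for patient representations: two well-separated clusters.
z = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]

# Votes from three hypothetical label functions for each patient.
votes = [
    [1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 0, 1],
    [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 1],
]

pseudo_labels = [majority_vote(v) for v in votes]
subset = select_consistent_subset(z, pseudo_labels, k=3, min_agreement=0.6)
print(subset)
```

In this toy data, the fourth patient (index 3) receives pseudo label 0 while all three of its nearest neighbors receive 1, so it is dropped. The remaining patients are kept and would be passed, together with the labeled cell-line profiles, to the downstream classifier.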