Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

Tianshu Feng; Rohan Gnanaolivu; Abolfazl Safikhani; Yuanhang Liu; Jun Jiang; Nicholas Chia; Alexander Partin; Priyanka Vasanthakumari; Yitan Zhu; Chen Wang

TL;DR

VETE is a variational, gene ontology (GO)-guided encoder for cancer transcriptomics that jointly encodes gene expression and drug structure into a probabilistic latent space, enabling robust cancer type classification and drug response prediction with traceable biological explanations. The framework combines a hierarchical GO-based transcriptomics subnetwork, a drug-embedding module, a variational bottleneck, and a graph-based local explanation method (GIG) with Sankey visualization, all tuned through large-scale hyperparameter optimization. Empirically, VETE outperforms standard baselines (MLP, XGBoost, and Random Forest) on cancer-type classification and drug-response prediction, and it surfaces GO-term pathways that align with known cancer biology, as demonstrated on GDSC, TCGA, and CCLE-derived data. Overall, VETE links AI predictions to mechanistic biology, offering interpretable, scalable insights for precision oncology and drug discovery.

Abstract

Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomic profiling data, which describe molecular activity in tumors and cancer cell lines, are widely used to predict anti-cancer drug responses. However, existing AI models are hampered by noise in transcriptomics data and lack biological interpretability. To overcome these limitations, we introduce VETE (Variational and Explanatory Transcriptomics Encoder), a novel neural network framework that incorporates a variational component to mitigate the effects of noise and integrates a traceable gene ontology into the network architecture for encoding cancer transcriptomics data. Key innovations include a local interpretability-guided method for identifying ontology paths, a visualization tool to elucidate the biological mechanisms of drug responses, and the application of centralized large-scale hyperparameter optimization. VETE demonstrated robust accuracy in cancer cell line classification and drug response prediction. It also provided traceable biological explanations for both tasks, offering insight into the mechanisms underlying its predictions. VETE thereby bridges the gap between AI-driven predictions and biologically meaningful insights in cancer research.
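
To make the two architectural ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single ontology-masked layer (each gene feeds only the GO terms it belongs to) followed by a Gaussian variational bottleneck sampled with the reparameterization trick. The gene-to-term membership matrix, layer sizes, weights, and function names (`masked_linear`, `variational_bottleneck`) are all hypothetical and chosen only for illustration.

```python
import numpy as np

def masked_linear(x, weights, mask):
    """Linear layer whose weights are zeroed wherever mask is 0,
    so each output unit only sees its permitted inputs."""
    return x @ (weights * mask)

def variational_bottleneck(h, w_mu, w_logvar, rng):
    """Map h to a diagonal Gaussian and sample from it
    with the reparameterization trick."""
    mu = h @ w_mu
    logvar = h @ w_logvar
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    # KL divergence from N(mu, sigma^2) to N(0, 1), summed over latent dims.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    return z, mu, logvar, kl

rng = np.random.default_rng(0)
n_samples, n_genes, n_terms, n_latent = 4, 6, 3, 2

# Hypothetical gene-to-GO-term membership (rows: genes, columns: terms).
mask = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
], dtype=float)

# Random stand-ins for expression values and learned weights.
x = rng.standard_normal((n_samples, n_genes))
w_go = rng.standard_normal((n_genes, n_terms))
w_mu = rng.standard_normal((n_terms, n_latent))
w_logvar = rng.standard_normal((n_terms, n_latent))

# GO-term activations, then a noisy latent code per sample.
h = np.tanh(masked_linear(x, w_go, mask))
z, mu, logvar, kl = variational_bottleneck(h, w_mu, w_logvar, rng)
```

Because the mask removes every gene-to-term connection that the ontology does not license, each hidden unit can be read as the activity of one GO term, which is what makes a path through such a network traceable. The KL term would be added to the task loss during training to regularize the latent space.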

Paper Structure (13 sections, 3 equations, 8 figures, 1 table, 1 algorithm)

Introduction
Methodology
Data Preparation
Variational and Explanatory Transcriptomics Encoder for Biological Graphs
Identifying Critical Hierarchical Paths with Local Model Explanation Techniques
Visualization with Sankey Plot
Large Scale Hyper Parameter Optimization
Experiments
Model Training Tasks
VETE for Cell Line Classification
VETE for Drug Response Prediction
Discussion
VETE for Cancer Type Prediction with TCGA

Figures (8)

Figure 1: Model design: (a) Overall VETE framework; (b) Model structure for drug-cell embedding and downstream tasks; (c) Hierarchical neural network (HNN) following the hierarchical structure of an ontology of cellular subsystems to encode gene expression of samples.
Figure 2: Cell line classification results: (a) ROC curve for VETE, MLP, XGBoost, and Random Forest; (b) Comparison of VETE, MLP, XGBoost, and Random Forest using AUC-ROC.
Figure 3: Drug response prediction results: (a) predicted vs. actual drug responses across all drug-cell pairs; (b) performance comparison of VETE against MLP, XGBoost, and Random Forest.
Figure 4: Top 10 drugs with the highest drug response prediction accuracy.
Figure 5: Hierarchical explanation of VETE for: (a) Docetaxel-OV pair; (b) Docetaxel-BRCA pair. Blue squares highlight selected shared GO terms between the two pairs, while red squares highlight selected unique GO terms.
...and 3 more figures
