Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses
Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang
TL;DR
VETE is a variational, Gene Ontology (GO)-guided encoder for cancer transcriptomics that jointly encodes gene expression and drug structure into a probabilistic latent space, enabling robust cancer-type classification and drug-response prediction with traceable biological explanations. The framework combines a hierarchical GO-based transcriptomics subnetwork, a drug-embedding module, a variational bottleneck, and a graph-based local explanation method (GIG) with Sankey visualization, all tuned via large-scale hyperparameter optimization. Empirically, VETE outperforms standard baselines on cancer-type classification and drug-response prediction, and it recovers GO-term pathways that align with known cancer biology, as demonstrated on GDSC, TCGA, and CCLE-derived data. Overall, VETE links AI predictions to mechanistic biology, offering interpretable, scalable insights for precision oncology and drug discovery.
Abstract
Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomic profiling data, which describe molecular activities in tumors and cancer cell lines, are widely used to predict anti-cancer drug responses. However, existing AI models are limited by noise in transcriptomics data and by a lack of biological interpretability. To overcome these limitations, we introduce VETE (Variational and Explanatory Transcriptomics Encoder), a novel neural network framework that incorporates a variational component to mitigate the effects of noise and integrates a traceable gene ontology into the network architecture for encoding cancer transcriptomics data. Key innovations include a local interpretability-guided method for identifying ontology paths, a visualization tool to elucidate the biological mechanisms of drug responses, and the application of centralized large-scale hyperparameter optimization. VETE demonstrated robust accuracy in cancer cell line classification and drug response prediction. It also provided traceable biological explanations for both tasks, offering insights into the mechanisms underlying its predictions. By linking AI-driven predictions to biologically meaningful insights, VETE represents a promising advance for cancer research.
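The two architectural ideas named above, an ontology-constrained layer and a variational component, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard Gaussian reparameterization with a KL penalty toward N(0, I), and the four-gene, two-term annotation mask, the weights, and all function names (`masked_linear`, `reparameterize`, `kl_to_standard_normal`) are hypothetical choices made only for illustration.

```python
import math
import random


def masked_linear(x, weights, mask):
    """Map gene values to term values; weights[g][t] counts only if mask[g][t] == 1."""
    n_terms = len(mask[0])
    return [
        sum(x[g] * weights[g][t] * mask[g][t] for g in range(len(x)))
        for t in range(n_terms)
    ]


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), independently per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, sigma^2) to N(0, I), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


rng = random.Random(0)

# Hypothetical toy annotation: rows are genes, columns are GO terms.
# go_mask[g][t] == 1 means gene g is annotated to term t.
go_mask = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

# Made-up expression values for one sample (4 genes) and random, untrained weights.
expression = [0.8, -1.2, 0.3, 2.0]
w_mu = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]
w_log_var = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]

# Each GO term receives input only from its annotated genes.
mu = masked_linear(expression, w_mu, go_mask)
log_var = masked_linear(expression, w_log_var, go_mask)

# Variational bottleneck: a noisy latent sample plus its regularization term.
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print("latent sample:", z)
print("KL penalty:", kl)
```

Because the mask zeroes every weight between a gene and a term it is not annotated to, each latent unit can be traced back to a named ontology term, which is the property that makes path-level explanations possible. In a trained model the KL term would be added to the task loss with some weighting; that weighting is not stated in the text above.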
