Genotype-Phenotype Integration through Machine Learning and Personalized Gene Regulatory Networks for Cancer Metastasis Prediction

Jiwei Fu; Chunyu Yang

TL;DR

Metastasis prediction remains challenging across cancer types and resource settings. The authors benchmark traditional ML models on CCLE expression data and build personalized gene regulatory networks with PANDA and LIONESS, which are fed into a Graph Attention Network v2 (GATv2) to capture patient-specific regulatory patterns. XGBoost achieved the strongest performance (AUROC ≈ 0.705), while the GNN reached AUROC ≈ 0.642, illustrating the complementary strengths of the two approaches and the limited topology signal in this dataset. The framework demonstrates feasibility for low-cost pan-cancer screening and provides a dual population- and patient-level approach to precision oncology that can guide resource allocation and future multi-omics integration.
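The LIONESS step mentioned above can be illustrated with a minimal sketch. LIONESS estimates the network of a single sample q from two aggregate networks, one built on all N samples and one built without q: e(q) = N · (e(all) − e(without q)) + e(without q). The code below is an illustration under stated assumptions, not the authors' pipeline: a plain Pearson correlation matrix stands in for PANDA as the aggregate network method, and the toy matrix `expr` and all function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def aggregate_network(expr):
    """Assumed stand-in for PANDA: gene-by-gene correlation over samples.

    expr[g] holds the expression values of gene g across all samples.
    """
    genes = range(len(expr))
    return [[pearson(expr[i], expr[j]) for j in genes] for i in genes]

def lioness(expr, q):
    """Sample-specific network for sample index q via the LIONESS equation."""
    n = len(expr[0])
    full = aggregate_network(expr)
    # Rebuild the aggregate network with sample q left out.
    without_q = aggregate_network([row[:q] + row[q + 1:] for row in expr])
    return [[n * (f - w) + w for f, w in zip(frow, wrow)]
            for frow, wrow in zip(full, without_q)]

# Toy data (hypothetical): 3 genes measured in 5 samples.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 6.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
sample_networks = [lioness(expr, q) for q in range(len(expr[0]))]
```

Each entry of `sample_networks` is one gene-by-gene edge-weight matrix per sample, which is the kind of per-sample graph a graph neural network could take as input.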

Abstract

Metastasis is the leading cause of cancer-related mortality, yet most predictive models rely on shallow architectures and neglect patient-specific regulatory mechanisms. Here, we integrate classical machine learning and deep learning to predict metastatic potential across multiple cancer types. Gene expression profiles from the Cancer Cell Line Encyclopedia (CCLE) were combined with a transcription factor-target prior from DoRothEA, focusing on nine metastasis-associated regulators. After selecting differentially expressed genes with the Kruskal-Wallis test, ElasticNet, Random Forest, and XGBoost models were trained for benchmarking. Personalized gene regulatory networks were then constructed using PANDA and LIONESS and analyzed with a graph attention network (GATv2) to learn topological and expression-based representations. While XGBoost achieved the highest AUROC (0.7051), the GNN captured non-linear regulatory dependencies at the patient level. These results demonstrate that combining traditional machine learning with graph-based deep learning enables a scalable and interpretable framework for metastasis risk prediction in precision oncology.
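The Kruskal-Wallis gene selection step can be sketched as follows. This is a from-scratch illustration for the two-group case (metastatic vs. non-metastatic), where the H statistic follows a chi-square distribution with one degree of freedom. The threshold `alpha=0.05`, the gene names, and the toy values are assumptions, not values from the paper; the sketch omits tie correction and multiple-testing correction, and in practice a library routine such as `scipy.stats.kruskal` would normally be used.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_two_groups(a, b):
    """Kruskal-Wallis H and p-value for two groups (no tie correction)."""
    n = len(a) + len(b)
    ranks = average_ranks(list(a) + list(b))
    ra, rb = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (ra ** 2 / len(a) + rb ** 2 / len(b)) - 3 * (n + 1)
    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(max(h, 0.0) / 2))
    return h, p

def select_genes(expr, labels, alpha=0.05):
    """Keep genes whose expression differs between label 0 and label 1.

    expr maps gene name -> per-sample values; alpha is an assumed cutoff.
    """
    kept = []
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        _, p = kruskal_two_groups(pos, neg)
        if p < alpha:
            kept.append(gene)
    return kept

# Toy data (hypothetical gene names and values), 3 samples per class.
labels = [0, 0, 0, 1, 1, 1]
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],  # classes fully separated
    "GENE_B": [1.0, 4.0, 2.0, 5.0, 3.0, 6.0],  # classes overlap
}
selected = select_genes(toy_expr, labels)
```

The genes that survive this filter would then form the feature matrix for the benchmark classifiers.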
