GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation

Ziwei Yang; Zheng Chen; Xin Liu; Rikuto Kotoge; Peng Chen; Yasuko Matsubara; Yasushi Sakurai; Jimeng Sun

TL;DR

GeSubNet derives disease subtype-specific gene networks by learning a unified representation that combines patient gene expression with prior knowledge graphs. The framework has three modules: Patient-M (VQ-VAE-based subtype encoding), Graph-M (Neo-GNN-based prior-network encoding), and Infer-M (integration of the two for subtype-specific network generation). Together they produce sparse, biologically meaningful subtype graphs. Across four TCGA cancer types, GeSubNet shows substantial gains over baselines on graph similarity and diversity metrics. GO enrichment and a novel gene knockout analysis indicate strong biological relevance (e.g., an 83% shift likelihood for high-ranking genes in BRCA). The work shows that integrating experimental data with curated gene networks can produce targeted networks that reflect subtype biology, with potential implications for biomarker discovery and precision oncology.

Abstract

Retrieving gene functional networks from knowledge databases presents a challenge due to the mismatch between disease networks and subtype-specific variations. Current solutions, including statistical and deep learning methods, often fail to effectively integrate gene interaction knowledge from databases or to explicitly learn subtype-specific interactions. To address this mismatch, we propose GeSubNet, which learns a unified representation capable of predicting gene interactions while distinguishing between different disease subtypes. Graphs generated from such representations can be considered subtype-specific networks. GeSubNet is a multi-step representation learning framework with three modules. First, a deep generative model learns distinct disease subtypes from patient gene expression profiles. Second, a graph neural network captures representations of prior gene networks from knowledge databases, ensuring accurate physical gene interactions. Finally, we integrate these two representations using an inference loss that leverages graph generation capabilities, conditioned on the patient separation loss, to refine subtype-specific information in the learned representation. GeSubNet consistently outperforms traditional methods, with improvements of 30.6%, 21.0%, 20.1%, and 56.6% on four graph evaluation metrics, each averaged over four cancer datasets. We also conduct a biological simulation experiment to assess how the behavior of selected genes from over 11,000 candidates affects subtypes or patient distributions. The results show that the generated network has the potential to identify subtype-specific genes with an 83% likelihood of impacting patient distribution shifts.
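
The TL;DR names a VQ-VAE as the backbone of the patient module. The defining step of any VQ-VAE is vector quantization: each encoder output is snapped to its nearest entry in a learned codebook. The sketch below illustrates only that lookup. The 2-D embeddings, the three-entry codebook, and the reading of one code per subtype are hypothetical and chosen for illustration; the paper's actual encoder, codebook size, and training losses are not specified in this summary.

```python
import math

def quantize(embedding, codebook):
    """Return (index, code vector) of the codebook entry nearest to `embedding`."""
    # Euclidean distance from the embedding to every code vector.
    distances = [math.dist(embedding, code) for code in codebook]
    index = min(range(len(codebook)), key=distances.__getitem__)
    return index, codebook[index]

# Hypothetical codebook: one code vector per putative subtype (assumption).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]

# Hypothetical encoder outputs for four patients (made-up values).
patient_embeddings = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7), (0.6, 0.7)]

# Discrete code index assigned to each patient.
assignments = [quantize(z, codebook)[0] for z in patient_embeddings]
print(assignments)  # [0, 1, 2, 1]
```

In a trained model the codebook is learned jointly with the encoder, so the discrete indices can act as cluster labels for patients; here they are fixed by hand purely to show the lookup.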
