X2Graph for Cancer Subtyping Prediction on Biological Tabular Data
Tu Bui, Mohamed Suliman, Aparajita Haldar, Mohammed Amer, Serban Georgescu
TL;DR
X2Graph is a knowledge-base (KB) guided approach to cancer subtyping on small biological tabular datasets. It converts each row into a graph whose edges reflect prior knowledge about relationships between columns and whose node features encode each feature's index and value, so that graph neural networks can be applied and overfitting in data-scarce settings is mitigated. A late fusion of multiple KB-based models yields robust predictions across copy number variation (CNV), RNA, and clinical data, and interpretability analyses link the top-ranked features to known cancer biology. The method outperforms existing tree-based and deep learning baselines on three cancer subtyping datasets and offers a principled way to integrate external biological knowledge into the analysis of tabular oncology data.
Abstract
Despite the transformative impact of deep learning on text, audio, and image datasets, its dominance in tabular data, especially in the medical domain where data are often scarce, remains less clear. In this paper, we propose X2Graph, a novel deep learning method that achieves strong performance on small biological tabular datasets. X2Graph leverages external knowledge about the relationships between table columns, such as gene interactions, to convert each sample into a graph structure. This transformation enables the application of standard message passing algorithms for graph modeling. Our X2Graph method demonstrates superior performance compared to existing tree-based and deep learning methods across three cancer subtyping datasets.
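To make the row-to-graph idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each column becomes one node carrying its column index and value, that KB relationships arrive as undirected pairs of column indices, and that message passing is a single round of mean aggregation over a node and its neighbours. The names `row_to_graph`, `mean_message_pass`, and `kb_edges`, and the toy values, are hypothetical.

```python
import numpy as np


def row_to_graph(row, kb_edges):
    """Turn one table row into (node_features, adjacency).

    row:      1-D sequence of column values for a single sample.
    kb_edges: (i, j) column-index pairs taken from a knowledge base
              (hypothetical input format, assumed undirected).
    """
    row = np.asarray(row, dtype=float)
    n = row.shape[0]
    # Each node stores its column index and the column's value.
    node_features = np.stack([np.arange(n, dtype=float), row], axis=1)
    adjacency = np.zeros((n, n))
    for i, j in kb_edges:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    return node_features, adjacency


def mean_message_pass(values, adjacency):
    """One round of mean aggregation over each node and its neighbours."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    degree = a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ values) / degree


# Toy sample with four columns; the KB links column 0-1 and column 1-2.
row = [0.5, 2.0, -1.0, 3.0]
kb_edges = [(0, 1), (1, 2)]

node_features, adjacency = row_to_graph(row, kb_edges)
smoothed = mean_message_pass(node_features[:, 1:], adjacency)
print(smoothed.ravel())  # column 3 has no KB neighbours, so it keeps its value
```

Because the graph topology comes from the KB and not from the data, every sample shares the same edges and differs only in its node values. A real model would replace the fixed averaging with learned message-passing layers and would typically embed the column index instead of using it as a raw number.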
