Identifying Key Genes in Cancer Networks Using Persistent Homology

Rodrigo Henrique Ramos; Yago Augusto Bardelotte; Cynthia de Oliveira Lage Ferreira; Adenilso Simao

TL;DR

This work shows that cancer genes play an important role in higher-order structures, going beyond pairwise measures, and provides an approach to distinguish drivers and cancer-associated genes from passenger genes.

Abstract

Identifying driver genes is crucial for understanding oncogenesis and developing targeted cancer therapies. Driver discovery methods using protein or pathway networks rely on traditional network science measures, focusing on nodes, edges, or community metrics. These methods can overlook the high-dimensional interactions that cancer genes have within cancer networks. This study presents a novel method using Persistent Homology to analyze the role of driver genes in higher-order structures within Cancer Consensus Networks derived from main cellular pathways. We integrate mutation data from six cancer types and three biological functions: DNA Repair, Chromatin Organization, and Programmed Cell Death. We systematically evaluated the impact of gene removal on topological voids ($\beta_2$ structures) within the Cancer Consensus Networks. Our results reveal that only known driver genes and cancer-associated genes influence these structures, while passenger genes do not. Although centrality measures alone proved insufficient to fully characterize the impacting genes, combining higher-order topological analysis with traditional network metrics can improve the precision of distinguishing between drivers and passengers. This work shows that cancer genes play an important role in higher-order structures, going beyond pairwise measures, and provides an approach to distinguish drivers and cancer-associated genes from passenger genes.
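
The gene-removal idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes an unweighted graph, its clique (flag) complex, and homology over GF(2), and it uses a brute-force clique search that only suits toy inputs. The helper names (`betti2`, `beta2_impact`) and the toy network (an octahedron with one pendant node `P`) are hypothetical.

```python
from itertools import combinations

def cliques(adj, k):
    """All k-vertex cliques of a graph given as {node: set(neighbours)} (brute force)."""
    return [c for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]

def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are ints used as bitmasks."""
    pivots = {}  # highest set bit -> reduced row
    for r in rows:
        while r:
            h = r.bit_length() - 1
            if h not in pivots:
                pivots[h] = r
                break
            r ^= pivots[h]
    return len(pivots)

def boundary_rank(simplices, faces):
    """Rank of the boundary map sending each simplex to the sum of its faces."""
    index = {f: i for i, f in enumerate(faces)}
    rows = []
    for s in simplices:
        mask = 0
        for f in combinations(s, len(s) - 1):
            mask |= 1 << index[f]
        rows.append(mask)
    return rank_gf2(rows)

def betti2(adj):
    """beta_2 = dim C_2 - rank(d_2) - rank(d_3) for the clique complex."""
    edges, tri, tet = cliques(adj, 2), cliques(adj, 3), cliques(adj, 4)
    return len(tri) - boundary_rank(tri, edges) - boundary_rank(tet, tri)

def remove_node(adj, g):
    """Copy of the graph without node g."""
    return {u: nbrs - {g} for u, nbrs in adj.items() if u != g}

def beta2_impact(adj):
    """Drop in beta_2 caused by removing each node on its own."""
    base = betti2(adj)
    return {g: base - betti2(remove_node(adj, g)) for g in adj}

# Toy network: an octahedron (a hollow 2-sphere, so beta_2 = 1) plus a pendant node P.
pairs = [("A", "a"), ("B", "b"), ("C", "c")]  # antipodal pairs are NOT connected
octa = [v for p in pairs for v in p]
adj = {v: set() for v in octa + ["P"]}
for u, v in combinations(octa, 2):
    if (u, v) not in pairs and (v, u) not in pairs:
        adj[u].add(v)
        adj[v].add(u)
adj["P"].add("A")
adj["A"].add("P")

impact = beta2_impact(adj)
print(impact)  # octahedron vertices -> 1, pendant P -> 0
```

In this toy case every octahedron vertex destroys the void when removed, while the pendant node does not, which mirrors the distinction the abstract draws between impacting and non-impacting genes. On networks of realistic size, a dedicated topological data analysis library would be needed in place of the brute-force enumeration.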

Paper Structure (10 sections, 5 figures, 2 tables)

This paper contains 10 sections, 5 figures, 2 tables.

Introduction
Cancer Mutation Data and Reactome's Super Pathways
Persistent Homology
Persistence, Barcodes and Betti Numbers
Persistent Homology in Cancer Studies
Methods
Result and Discussion
Impact on $\beta_2$ by single node removal
Impacting genes and centrality measures
Conclusion

Figures (5)

Figure 1: From network to persistence barcodes.
Figure 2: Data pipeline: We gather data from different databases to create Cancer Consensus Networks (CCNs), integrating data from three main biological functions with cancer-specific information. After that, we analyse the topological role of drivers and non-drivers in relation to their impact on higher-order structures.
Figure 3: Centrality distributions for CCNs. Grey points represent genes whose removal does not affect $\beta_2$. Red and blue points indicate genes whose removal reduces $\beta_2$, with red points representing known drivers and blue points genes associated with cancer.
Figure 4: (A) Examples of simplices of dimensions 0, 1, 2 and 3. (B) The Vietoris-Rips filtration for a point cloud consisting of four equidistant points and the persistence barcode capturing the birth and death of topological structures.
Figure 5: The Betti number $\beta_1$ for the complexes $K_0$ and $K_1$. Up to scalar multiples, the only element of $C_1$ in the kernel of $\partial_1$ is $[0,1] + [1,2] + [2,0]$, so $\dim(Z_1(K_0)) = \dim(Z_1(K_1)) = 1$. The boundary of the simplex $[0,1,2]$ is the oriented triangle $[1,2]-[0,2]+[0,1]$. As a result, $\dim(B_1(K_0)) = 0$ and $\dim(B_1(K_1)) = 1$, yielding $\beta_1(K_0) = 1$ and $\beta_1(K_1) = 0$. This indicates that $K_0$ contains a 1-dimensional hole, whereas $K_1$ does not.
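
The counts in the Figure 5 caption can be checked with the rank-nullity theorem. The worked version below assumes, as the caption suggests, that $K_0$ is the hollow triangle on vertices $0, 1, 2$ and that $K_1$ adds the 2-simplex $[0,1,2]$. The ordering of rows (vertices $0, 1, 2$) and columns (edges $[0,1]$, $[1,2]$, $[2,0]$) is a choice made here for illustration.

$$
\partial_1 =
\begin{pmatrix}
-1 & 0 & 1 \\
1 & -1 & 0 \\
0 & 1 & -1
\end{pmatrix},
\qquad
\operatorname{rank}(\partial_1) = 2,
\qquad
\dim Z_1 = \dim C_1 - \operatorname{rank}(\partial_1) = 3 - 2 = 1 .
$$

$$
\beta_1(K_0) = \dim Z_1(K_0) - \dim B_1(K_0) = 1 - 0 = 1,
\qquad
\beta_1(K_1) = \dim Z_1(K_1) - \dim B_1(K_1) = 1 - 1 = 0 .
$$

The three columns of $\partial_1$ sum to zero, which is exactly the statement that $[0,1] + [1,2] + [2,0]$ is a cycle. Any two of the columns are linearly independent, which gives the rank of 2.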
