TopER: Topological Embeddings in Graph Representation Learning

Astrit Tola, Funmilola Mary Taiwo, Cuneyt Gurcan Akcora, Baris Coskunuzer

TL;DR

TopER addresses the need for interpretable and scalable graph representations by replacing costly persistence diagram computations with a two-parameter topological evolution rate $TE_f(\mathcal{G},\mathcal{I})=(a,b)$ derived from a simple graph filtration. By fitting a line to the node and edge counts recorded across the filtration, TopER yields a compact 2D embedding that preserves topological growth patterns and supports intuitive visualization. Empirically, it achieves competitive graph classification and clustering across molecular, biological, and social datasets and offers stability guarantees under filtration perturbations. This enables scalable graph analysis with interpretable structure and opens avenues for integration into graph foundation models and visual analytics.

Abstract

Graph embeddings play a critical role in graph representation learning, allowing machine learning models to explore and interpret graph-structured data. However, existing methods often rely on opaque, high-dimensional embeddings, limiting interpretability and practical visualization. In this work, we introduce Topological Evolution Rate (TopER), a novel, low-dimensional embedding approach grounded in topological data analysis. TopER simplifies a key topological approach, Persistent Homology, by calculating the evolution rate of graph substructures, resulting in intuitive and interpretable visualizations of graph data. This approach not only enhances the exploration of graph datasets but also delivers competitive performance in graph clustering and classification tasks. Our TopER-based models achieve or surpass state-of-the-art results across molecular, biological, and social network datasets in tasks such as classification, clustering, and visualization.
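
The line fit at the core of TopER can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes the filtration has already produced one (node count, edge count) pair per step, that edge counts are regressed on node counts, and that $(a,b)$ is read as (intercept, slope). The function name `fit_evolution_rate` is hypothetical.

```python
def fit_evolution_rate(node_counts, edge_counts):
    """Ordinary least-squares fit of edge_counts ~ a + b * node_counts.

    Returns (a, b). Assumption: a is the intercept and b the slope;
    the summary above does not fix this order.
    """
    if len(node_counts) != len(edge_counts) or len(node_counts) < 2:
        raise ValueError("need at least two (node, edge) count pairs")

    n = len(node_counts)
    mean_x = sum(node_counts) / n
    mean_y = sum(edge_counts) / n

    # Sum of squared deviations in x, and the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in node_counts)
    if sxx == 0:
        raise ValueError("node counts are constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(node_counts, edge_counts))

    b = sxy / sxx              # slope: edges gained per node gained
    a = mean_y - b * mean_x    # intercept of the best-fit line
    return a, b


# Counts from a hypothetical three-step filtration.
a, b = fit_evolution_rate([2, 4, 6], [1, 3, 5])
print(a, b)
```

Each graph is thus reduced to a single point $(a,b)$, which is what makes a whole dataset plottable in the plane.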

Paper Structure (33 sections, 4 theorems, 13 equations, 10 figures, 15 tables, 1 algorithm)

Key Result

Theorem 3.4

Let $\mathcal{X}$ be a compact metric space, and $f,g:\mathcal{X}\to\mathbb{R}$ be two filtration functions. Then, for some $C>0$,

Figures (10)

  • Figure 1: TopER Visualizations. Each data point represents an individual graph. On the left, TopER is applied to three benchmark compound datasets using closeness sublevel filtration. The middle panel zooms in on the red point cloud from the left, demonstrating TopER's effectiveness in distinguishing between classes within the MUTAG dataset. On the right, a TopER visualization for the IMDB-B dataset is displayed.
  • Figure 2: Filtration. For $\mathcal{G}=\mathcal{G}_3$ in both examples, the top figure illustrates superlevel filtration with node degree function for thresholds $\{1,2,3\}$. Similarly, the bottom figure illustrates sublevel filtration for edge weights with thresholds $\{1.5,1.8,2.1\}$.
  • Figure 3: TopER steps. The filtration process on three different graphs using node or edge filtration. The graphs undergo filtration, and for each graph, a best-fit line is determined through the filtration data. The coefficients of these best-fit lines are then used as descriptors for the graphs.
  • Figure 4: TopER visualizations of the PROTEINS dataset with Ollivier-Ricci (O.Ricci) edge filtration, and the BZR dataset with degree centrality node filtration. Each point corresponds to an individual graph.
  • Figure 5: Scalability. TopER run time for synthetic power-law graphs (Holme and Kim, 2002) with node degree filtration. The mean node degree is $30$, and 100 filtration steps are used.
  • ...and 5 more figures
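
The two filtration types named in the Figure 2 caption can be sketched as follows. This is an illustrative reading under stated assumptions, not code from the paper: a node filtration keeps the subgraph induced by nodes whose value passes the threshold, an edge-weight filtration keeps the edges passing the threshold together with their endpoints, and a superlevel filtration visits thresholds in decreasing order so that the subgraphs grow. The toy graph and all function names are hypothetical.

```python
def degrees(edges):
    """Node degree computed from an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def node_filtration_counts(edges, f, thresholds, superlevel=False):
    """(node count, edge count) of each induced subgraph in the filtration."""
    counts = []
    # Superlevel: decreasing thresholds, so subgraphs are nested and growing.
    for t in sorted(thresholds, reverse=superlevel):
        keep = {v for v, val in f.items()
                if (val >= t if superlevel else val <= t)}
        n_edges = sum(1 for u, v in edges if u in keep and v in keep)
        counts.append((len(keep), n_edges))
    return counts


def edge_filtration_counts(weighted_edges, thresholds):
    """Sublevel filtration on edge weights: keep edges with weight <= t."""
    counts = []
    for t in sorted(thresholds):
        kept = [(u, v) for u, v, w in weighted_edges if w <= t]
        nodes = {x for edge in kept for x in edge}
        counts.append((len(nodes), len(kept)))
    return counts


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
print(node_filtration_counts(edges, degrees(edges), [1, 2, 3], superlevel=True))

weighted = [(0, 1, 1.2), (1, 2, 1.7), (2, 0, 2.0), (2, 3, 2.5)]
print(edge_filtration_counts(weighted, [1.5, 1.8, 2.1]))
```

The resulting list of count pairs is the "filtration data" through which the best-fit line in Figure 3 is drawn.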

Theorems & Definitions (9)

  • Definition 3.1: Topological Evolution Rate (TopER)
  • Remark 3.2: Why a Linear Fit?
  • Remark 3.3: On the Name TopER
  • Theorem 3.4
  • Corollary 3.5
  • Lemma 3.6
  • Lemma 3.7
  • proof
  • proof