CliquePH: Higher-Order Information for Graph Neural Networks through Persistent Homology on Clique Graphs

Davide Buffelli, Farzin Soleymani, Bastian Rieck

TL;DR

This work introduces a method that extracts information about higher-order structures in a graph while still using the efficient low-dimensional persistent homology algorithm, and shows that it can improve test accuracy by up to $31\%$.

Abstract

Graph neural networks have become the default choice for practitioners working on graph learning tasks such as graph classification and node classification. Nevertheless, popular graph neural network models still struggle to capture higher-order information, i.e., information that goes beyond pairwise interactions. Recent work has shown that persistent homology, a tool from topological data analysis, can enrich graph neural networks with topological information that they otherwise could not capture. Calculating such features is efficient for dimension 0 (connected components) and dimension 1 (cycles). For higher-order structures, however, the computation does not scale well, with a complexity of $O(n^d)$, where $n$ is the number of nodes and $d$ is the order of the structures. In this work, we introduce a novel method that extracts information about higher-order structures in the graph while still using the efficient low-dimensional persistent homology algorithm. On standard benchmark datasets, we show that our method can improve test accuracy by up to $31\%$.
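To make the "efficient for dimension 0" claim concrete, here is a minimal sketch of 0-dimensional persistent homology on a graph, computed with a union-find structure. It is an illustration under stated assumptions rather than the authors' implementation: the function name `zero_dim_persistence`, the use of fixed node values as the filtration (the paper learns them), and the convention that an edge enters at the maximum of its endpoint values are all choices made for this example.

```python
import math


def zero_dim_persistence(node_values, edges):
    """Return sorted (birth, death) pairs of connected components.

    Assumption for this sketch: a vertex enters the filtration at its own
    value and an edge enters at the maximum of its two endpoint values.
    """
    parent = list(range(len(node_values)))
    birth = list(node_values)  # birth time of the component rooted at each node

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    def edge_time(edge):
        return max(node_values[edge[0]], node_values[edge[1]])

    pairs = []
    for u, v in sorted(edges, key=edge_time):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # the edge closes a cycle; no component dies
        # Elder rule: the component that was born later dies at this edge.
        older, younger = (ru, rv) if birth[ru] <= birth[rv] else (rv, ru)
        pairs.append((birth[younger], edge_time((u, v))))
        parent[younger] = older

    # Components that never merge persist forever.
    roots = {find(u) for u in range(len(node_values))}
    pairs.extend((birth[r], math.inf) for r in roots)
    return sorted(pairs)


# Path graph 0 - 1 - 2 with hand-picked filtration values.
diagram = zero_dim_persistence([0.0, 2.0, 1.0], [(0, 1), (1, 2)])
print(diagram)
```

The cost is dominated by sorting the edges, followed by near-linear union-find operations, which is why this dimension remains cheap even on large graphs.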

Paper Structure (32 sections, 3 theorems, 6 equations, 5 figures, 10 tables)

Key Result

Theorem 4.1

Persistent homology is at least as expressive as the 1-WL algorithm, i.e., if the 1-WL label sequences for two graphs $G$ and $G^\prime$ diverge, there exists an injective filtration $f$ such that the corresponding 0-dimensional persistence diagrams $D_{G}^{(0)}$ and $D_{G^\prime}^{(0)}$ are not equal.
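The notion of 1-WL label sequences "diverging" can be illustrated with a short colour-refinement sketch. The helper names (`wl_label_sequence`, `cycle`, `count_components`) and the example graphs are assumptions made for this illustration and do not come from the paper. The example also shows the standard case where 1-WL fails: a 6-cycle and two disjoint triangles receive identical label sequences, although they differ in their number of connected components, which is what the essential classes of a 0-dimensional persistence diagram count.

```python
from collections import Counter


def wl_label_sequence(adj, rounds=3):
    """Multisets of 1-WL labels after each refinement round.

    Labels are nested tuples rather than hashed integers, so sequences
    from different graphs can be compared directly.
    """
    labels = {v: 0 for v in adj}
    sequence = [Counter(labels.values())]
    for _ in range(rounds):
        labels = {
            v: (labels[v], tuple(sorted(labels[u] for u in adj[v])))
            for v in adj
        }
        sequence.append(Counter(labels.values()))
    return sequence


def cycle(nodes):
    """Adjacency lists of a cycle through the given nodes."""
    n = len(nodes)
    return {nodes[i]: [nodes[i - 1], nodes[(i + 1) % n]] for i in range(n)}


def count_components(adj):
    """Number of connected components, found by depth-first search."""
    seen, components = set(), 0
    for start in adj:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
    return components


hexagon = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

# 1-WL separates the path from the star ...
print(wl_label_sequence(path) != wl_label_sequence(star))
# ... but not the hexagon from the two triangles.
print(wl_label_sequence(hexagon) == wl_label_sequence(two_triangles))
print(count_components(hexagon), count_components(two_triangles))
```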

Figures (5)

  • Figure 1: Overview of our method. CliquePH is composed of four stages. (1) First we "lift" the original graph by extracting its clique graphs. (2) We construct node embeddings using a graph neural network. (3) We use learnable functions to generate filtration values for all nodes and edges in all the lifted graphs. We then perform persistent homology (up to dimension 1) on all lifted graphs. (4) We incorporate the information from persistent homology and message-passing into a single representation. The whole model is trained end-to-end.
  • Figure 2: Ablation results for different architectures. (Left) Test accuracy for the structure-based experiments while changing the position of the CliquePH layer. (Center) Test accuracy for the structure-based experiments while increasing the total number of layers in the model. Error bars show standard deviation over 10 runs. (Right) Test accuracy for the structure-based experiments while changing the maximum dimension of the lifted clique graphs of CliquePH.
  • Figure 3: Comparison of test accuracy for the structure-based experiments while changing the position of the CliquePH layer in the GCN model architecture. Results averaged over 10 runs.
  • Figure 4: Comparison of test accuracy for the structure-based experiments while increasing the total number of layers in the model architecture. Error bars show the standard deviation over 10 runs.
  • Figure 5: Comparison of test accuracy for the structure-based experiments while changing the maximum dimension of the lifted clique graphs of CliquePH. Results averaged over 10 runs.
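Stage (1) of the Figure 1 caption, lifting the graph to its clique graphs, can be sketched as follows. This is a rough illustration rather than the paper's construction: the adjacency rule used here (two $k$-cliques are connected when they share $k-1$ vertices) is an assumption, as are the function names `k_cliques` and `clique_graph`, and the brute-force enumeration is only suitable for small graphs. Stages (2) to (4) are not shown.

```python
from itertools import combinations


def k_cliques(adj, k):
    """All k-vertex cliques of a graph given as {node: set(neighbours)}."""
    return [
        candidate
        for candidate in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(candidate, 2))
    ]


def clique_graph(adj, k):
    """Lifted graph whose nodes are the k-cliques of the input graph.

    Assumed rule for this sketch: two cliques are adjacent when they
    share exactly k - 1 vertices.
    """
    cliques = k_cliques(adj, k)
    edges = [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(cliques), 2)
        if len(set(a) & set(b)) == k - 1
    ]
    return cliques, edges


# Two triangles glued along the edge (1, 2).
graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}

edge_nodes, edge_links = clique_graph(graph, 2)        # nodes are edges
triangle_nodes, triangle_links = clique_graph(graph, 3)  # nodes are triangles
print(triangle_nodes, triangle_links)
```

Each lifted graph is again an ordinary graph, so persistent homology up to dimension 1 can be run on it without resorting to a higher-dimensional algorithm.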

Theorems & Definitions (3)

  • Theorem 4.1: Persistent Homology is at least as expressive as the 1-WL - Thm. 2 in Horn22a
  • Theorem 4.2: $k$-dimensional Persistent Homology is as expressive as $k$-WL - Thm. 3 in rieck2023expressivity
  • Theorem A.1