Cayley Graph Propagation

JJ Wilson, Maya Bechler-Speicher, Petar Veličković

TL;DR

The paper addresses over-squashing in graph neural networks with Cayley Graph Propagation (CGP), a scheme that interleaves message passing on the input graph with global communication on the complete Cayley graph of SL(2, Z_n). By retaining all Cayley nodes (including virtual ones) rather than truncating the graph, CGP achieves bottleneck-free information flow, evidenced by improved expansion properties and reduced diameters. Empirical results across OGB, TUDataset, and the Long Range Graph Benchmark show that CGP outperforms Expander Graph Propagation and several graph-rewiring baselines while maintaining favorable runtime and scalability. The work demonstrates that a theoretically grounded Cayley-graph template can provide practical gains for long-range information integration in real-world graph tasks; future directions include task-aligned Cayley edges and temporal graph settings.

Abstract

In spite of the plethora of success stories with graph neural networks (GNNs) on modelling graph-structured data, they are notoriously vulnerable to over-squashing, which arises when tasks necessitate the mixing of information between distant pairs of nodes. To address this problem, prior work suggests rewiring the graph structure to improve information flow. Alternatively, a significant body of research has dedicated itself to discovering and precomputing bottleneck-free graph structures to ameliorate over-squashing. One well-regarded family of bottleneck-free graphs within the mathematical community is the expander graphs, with prior work -- Expander Graph Propagation (EGP) -- proposing the use of a well-known expander graph family -- the Cayley graphs of the $\mathrm{SL}(2,\mathbb{Z}_n)$ special linear group -- as a computational template for GNNs. However, in EGP the computational graphs used are truncated to align with a given input graph. In this work, we show that truncation is detrimental to the coveted expansion properties. Instead, we propose CGP, a method to propagate information over a complete Cayley graph structure, thereby ensuring it is bottleneck-free to better alleviate over-squashing. Our empirical evidence across several real-world datasets shows not only that CGP recovers significant improvements as compared to EGP, but also that it matches or outperforms computationally complex graph rewiring techniques.
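To make the computational template concrete, the following minimal sketch enumerates the Cayley graph of $\mathrm{SL}(2,\mathbb{Z}_n)$ by breadth-first search from the identity matrix. It is an illustration under stated assumptions, not the authors' implementation: the generating set (the two elementary shear matrices and their inverses) is not given in the text above and is assumed here, and the function name `cayley_graph_sl2` is hypothetical.

```python
from collections import deque


def cayley_graph_sl2(n):
    """Enumerate the Cayley graph of SL(2, Z_n) for n >= 2.

    Matrices [[a, b], [c, d]] are stored as tuples (a, b, c, d) mod n.
    ASSUMPTION: the generators are the shears [[1, 1], [0, 1]] and
    [[1, 0], [1, 1]] together with their inverses; the source text does
    not specify the generating set.
    Returns (num_nodes, sorted list of directed edges (u, v)).
    """
    gens = [
        (1, 1, 0, 1),      # A
        (1, n - 1, 0, 1),  # A^{-1}
        (1, 0, 1, 1),      # B
        (1, 0, n - 1, 1),  # B^{-1}
    ]

    def mul(x, g):
        # 2x2 matrix product x @ g, reduced mod n.
        a, b, c, d = x
        e, f, p, q = g
        return (
            (a * e + b * p) % n,
            (a * f + b * q) % n,
            (c * e + d * p) % n,
            (c * f + d * q) % n,
        )

    identity = (1, 0, 0, 1)
    index = {identity: 0}  # group element -> node id
    edges = set()
    queue = deque([identity])
    while queue:
        x = queue.popleft()
        for g in gens:
            y = mul(x, g)
            if y not in index:
                index[y] = len(index)
                queue.append(y)
            edges.add((index[x], index[y]))
    return len(index), sorted(edges)


num_nodes, edges = cayley_graph_sl2(3)
```

For `n = 3` this yields 24 nodes, each with four outgoing edges. In the terms of the abstract, a truncated template would keep only as many of these nodes as the input graph has, whereas CGP keeps the complete graph and treats the surplus nodes as virtual nodes.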

Paper Structure (25 sections, 6 equations, 6 figures, 9 tables)

Figures (6)

  • Figure 3: Comparison of the total effective resistance $R_{tot}$ for CGP against the baseline model and EGP. A lower total effective resistance indicates that a graph is less susceptible to over-squashing.
  • Figure 4: The mean norm of the virtual node embeddings for CGP using different initialisation strategies on the TUDataset, including ZEROS, ONES, RANDOM and $\theta$.
  • Figure 5: The variance in the norm of the virtual node embeddings for CGP using different initialisation strategies on the TUDataset, including ZEROS, ONES, RANDOM and $\theta$.
  • Figure 6: Synthetic preprocessing benchmark comparing CGP with graph rewiring techniques, using Erdős–Rényi graphs with edge probability $p = \frac{5 \log n}{n}$. Left: Preprocessing time of CGP against DIGL, FoSR and GTR. Right: Preprocessing time of CGP against SDRF.
  • Figure 7: The learning curves of the same GNN model trained on graphs that have the same node features and differ only in their graph structure, which is sampled from different distributions. The label is computed from the node features without the use of any graph structure. The GNN overfits the graph structure instead of ignoring it, and therefore the model performance differs across graph distributions. Cayley graphs exhibit the best performance and the greatest robustness to overfitting.
  • ...and 1 more figures
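
The total effective resistance $R_{tot}$ reported in Figure 3 can be computed from the spectrum of the graph Laplacian. The sketch below is a hedged illustration rather than the paper's own code: it assumes a connected, undirected graph and the convention that $R_{tot}$ sums the effective resistance over unordered node pairs, which equals $n \sum_i 1/\lambda_i$ over the nonzero Laplacian eigenvalues $\lambda_i$. The function name `total_effective_resistance` is hypothetical.

```python
import numpy as np


def total_effective_resistance(adj):
    """Total effective resistance of a connected, undirected graph.

    ASSUMPTION: R_tot is the sum of pairwise effective resistances over
    unordered node pairs, computed as n * sum(1 / lambda_i) over the
    nonzero eigenvalues of the combinatorial Laplacian L = D - A.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    laplacian = np.diag(adj.sum(axis=1)) - adj
    eigenvalues = np.linalg.eigvalsh(laplacian)
    # Drop the zero eigenvalue (exactly one for a connected graph).
    nonzero = eigenvalues[eigenvalues > 1e-9]
    return float(n * np.sum(1.0 / nonzero))


# Path graph on three nodes: 0 - 1 - 2
path3 = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
r_path = total_effective_resistance(path3)

# Triangle (complete graph on three nodes)
tri = [[0, 1, 1],
       [1, 0, 1],
       [1, 1, 0]]
r_tri = total_effective_resistance(tri)
```

On these toy graphs the path gives a total of 4 (pairwise resistances 1, 1 and 2) while the better-connected triangle gives 2, in line with the caption's reading that a lower value indicates less susceptibility to over-squashing.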