What Can We Learn from State Space Models for Machine Learning on Graphs?

Yinan Huang, Siqi Miao, Pan Li

TL;DR

This work proposes Graph State Space Convolution (GSSC) as a principled extension of SSMs to graph-structured data. It shows that GSSC is provably more expressive than MPNNs at counting graph substructures and demonstrates its effectiveness on 11 widely used real-world benchmark datasets.

Abstract

Machine learning on graphs has recently found extensive applications across domains. However, the commonly used Message Passing Neural Networks (MPNNs) suffer from limited expressive power and struggle to capture long-range dependencies. Graph transformers offer a strong alternative thanks to their global attention mechanism, but they incur substantial computational overhead, especially on large graphs. In recent years, State Space Models (SSMs) have emerged as a compelling replacement for full attention in transformers when modeling sequential data. SSMs blend the strengths of RNNs and CNNs, offering a) efficient computation, b) the ability to capture long-range dependencies, and c) good generalization across sequences of various lengths. However, extending SSMs to graph-structured data presents unique challenges because graphs lack a canonical node ordering. In this work, we propose Graph State Space Convolution (GSSC) as a principled extension of SSMs to graph-structured data. By using global permutation-equivariant set aggregation together with factorizable graph kernels that depend on relative node distances as the convolution kernels, GSSC preserves all three advantages of SSMs. We demonstrate the provably stronger expressiveness of GSSC than MPNNs in counting graph substructures and show its effectiveness across 11 real-world, widely used benchmark datasets. GSSC achieves the best results on 6 of the 11 datasets, all with significant improvements over state-of-the-art baselines, and the second-best results on the other 5. Our findings highlight the potential of GSSC as a powerful and scalable model for graph machine learning. Our code is available at https://github.com/Graph-COM/GSSC.
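
To make the phrase "factorizable graph kernels" with "global permutation-equivariant set aggregation" more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It assumes a kernel of the hypothetical form $K(u,v) = \langle z_u, z_v \rangle$, where `Z` stands in for per-node positional embeddings whose inner products would encode relative distance; the function names and the random inputs are illustrative only. The point is that a factorized kernel lets the sum over all nodes be computed once and shared, avoiding the dense $n \times n$ kernel matrix, and that the result does not depend on node ordering.

```python
import numpy as np

def naive_kernel_conv(Z, X):
    """Quadratic-cost reference: h_u = sum_v <z_u, z_v> * x_v."""
    K = Z @ Z.T          # (n, n) dense kernel matrix
    return K @ X         # (n, d_x)

def factorized_kernel_conv(Z, X):
    """Same output, but the sum over nodes is computed once and shared."""
    S = Z.T @ X          # (d_z, d_x): order-independent sum over nodes of z_v x_v^T
    return Z @ S         # (n, d_x): each node reads the shared summary

rng = np.random.default_rng(0)
n, d_z, d_x = 6, 3, 4
Z = rng.normal(size=(n, d_z))   # hypothetical positional embeddings (assumption)
X = rng.normal(size=(n, d_x))   # node features

H_naive = naive_kernel_conv(Z, X)
H_fast = factorized_kernel_conv(Z, X)

# Relabel the nodes and recompute: the output rows are permuted the same way.
perm = rng.permutation(n)
H_perm = factorized_kernel_conv(Z[perm], X[perm])

print(np.allclose(H_naive, H_fast))
print(np.allclose(H_perm, H_fast[perm]))
```

In this sketch the factorized form costs on the order of $n \cdot d_z \cdot d_x$ operations rather than $n^2$, which is the kind of saving over full attention that the abstract alludes to.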

Paper Structure

This paper contains 21 sections, 3 theorems, 10 equations, 4 figures, and 10 tables.

Key Result

Proposition 3.1

There exists $\phi$ such that, for the GSSC layer (the equation labeled `gssm` in the paper), the gradient norm $\|\partial h_u/\partial x_v\|$ does not decay as $\mathrm{spd}(u,v)$ grows, where $\mathrm{spd}$ denotes the shortest-path distance.
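
As a toy illustration of what "does not decay with shortest-path distance" means, the sketch below compares two linear aggregators on a path graph. It does not reproduce the proposition's proof or the paper's choice of $\phi$. Both the local aggregator (repeated neighbor averaging, a stand-in for message passing) and the global one (a constant kernel $K(u,v) = 1$ built from a hypothetical constant embedding) are assumptions chosen to keep the Jacobians easy to read. With scalar features and linear maps, $\partial h_u/\partial x_v$ is simply the $(u,v)$ entry of the aggregation matrix.

```python
import numpy as np

n, L = 10, 3   # path graph with n nodes, L rounds of local averaging

# Adjacency of a path graph 0 - 1 - ... - (n-1), plus self-loops.
A = np.zeros((n, n))
idx = np.arange(n - 1)
A[idx, idx + 1] = 1.0
A[idx + 1, idx] = 1.0
A_hat = A + np.eye(n)
P = A_hat / A_hat.sum(axis=1, keepdims=True)   # row-normalized averaging

# Local: h = P^L x, so d h_u / d x_v = (P^L)[u, v].
J_local = np.linalg.matrix_power(P, L)

# Global: h_u = sum_v <z_u, z_v> x_v with a constant embedding (assumption),
# so d h_u / d x_v = 1 for every pair of nodes.
Z = np.ones((n, 1))
J_global = Z @ Z.T

# Sensitivity of node 0 to each node v; on a path graph spd(0, v) = v.
for v in range(n):
    print(v, abs(J_local[0, v]), abs(J_global[0, v]))
```

In this toy setting the local sensitivity shrinks with distance and is exactly zero beyond `L` hops, while the global-kernel sensitivity is the same at every distance.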

Figures (4)

  • Figure 1: Comparison of Sequence State Space Conv. (left) and Graph State Space Conv. (right).
  • Figure 2: Illustration of Graph State Space Convolution (GSSC).
  • Figure 3: Preprocessing costs per graph.
  • Figure 4: Model training and inference costs per graph.

Theorems & Definitions (7)

  • Proposition 3.1
  • Remark 3.1
  • Proposition 3.2
  • Proposition 3.3: Counting paths and cycles
  • proof
  • proof
  • proof