What Can We Learn from State Space Models for Machine Learning on Graphs?

Yinan Huang; Siqi Miao; Pan Li

TL;DR

This work proposes Graph State Space Convolution (GSSC) as a principled extension of SSMs to graph-structured data, proves that GSSC is more expressive than MPNNs in counting graph substructures, and shows its effectiveness across 11 widely used real-world benchmark datasets.

Abstract

Machine learning on graphs has recently found extensive applications across domains. However, the commonly used Message Passing Neural Networks (MPNNs) suffer from limited expressive power and struggle to capture long-range dependencies. Graph transformers offer a strong alternative thanks to their global attention mechanism, but they incur substantial computational overhead, especially on large graphs. In recent years, State Space Models (SSMs) have emerged as a compelling replacement for full attention in transformers when modeling sequential data. They blend the strengths of RNNs and CNNs, offering a) efficient computation, b) the ability to capture long-range dependencies, and c) good generalization across sequences of various lengths. However, extending SSMs to graph-structured data presents unique challenges because graphs lack a canonical node ordering. In this work, we propose Graph State Space Convolution (GSSC) as a principled extension of SSMs to graph-structured data. By leveraging global permutation-equivariant set aggregation and using factorizable graph kernels that rely on relative node distances as the convolution kernels, GSSC preserves all three advantages of SSMs. We prove that GSSC is more expressive than MPNNs in counting graph substructures and show its effectiveness across 11 widely used real-world benchmark datasets. GSSC achieves the best results on 6 of the 11 datasets, all with significant improvements over state-of-the-art baselines, and the second-best results on the other 5 datasets. Our findings highlight the potential of GSSC as a powerful and scalable model for graph machine learning. Our code is available at https://github.com/Graph-COM/GSSC.
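
The efficiency claim in the abstract rests on the convolution kernel being factorizable. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the kernel between nodes u and v can be written as an inner product of per-node vectors. The names `z` (per-node kernel features), `x` (node features), and both function names are hypothetical and chosen only for illustration. Under that assumption, the pairwise sum over all node pairs can be replaced by a single global sum over the node set followed by a per-node readout, which costs time linear in the number of nodes and does not depend on node ordering.

```python
def quadratic_aggregate(z, x):
    """Pairwise form: h_u = sum_v <z_u, z_v> * x_v, costing O(n^2) in the node count."""
    n, m = len(z), len(x[0])
    out = []
    for u in range(n):
        h = [0.0] * m
        for v in range(n):
            # Assumed factorizable kernel: inner product of per-node vectors.
            k_uv = sum(a * b for a, b in zip(z[u], z[v]))
            for j in range(m):
                h[j] += k_uv * x[v][j]
        out.append(h)
    return out


def factorized_aggregate(z, x):
    """Factorized form: pool S = sum_v z_v x_v^T once, then read out h_u = z_u^T S."""
    d, m = len(z[0]), len(x[0])
    # Global sum over the node set; the result does not depend on node order.
    S = [[0.0] * m for _ in range(d)]
    for z_v, x_v in zip(z, x):
        for i in range(d):
            for j in range(m):
                S[i][j] += z_v[i] * x_v[j]
    # Per-node readout, so the whole pass is O(n) for fixed d and m.
    return [[sum(z_u[i] * S[i][j] for i in range(d)) for j in range(m)] for z_u in z]
```

Both functions return one output row per node. Because the pooled matrix `S` is a plain sum over nodes, relabeling the nodes only reorders the output rows, which is the permutation-equivariance property the abstract refers to.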

Paper Structure (21 sections, 3 theorems, 10 equations, 4 figures, 10 tables)

This paper contains 21 sections, 3 theorems, 10 equations, 4 figures, 10 tables.

Introduction
Preliminaries
Graph State Space Convolution
Generalizing State Space Convolution to Graphs
Extensions and Discussions
Expressive Power
Related Works
Experiments
Graph Substructure Counting
Graph Learning Benchmarks
Computational Costs Comparison
Conclusion and Limitations
Deferred Proofs
Proof of Proposition 3.1
Proof of Proposition 3.2
...and 6 more sections

Key Result

Proposition 3.1

There exists $\phi$ such that, for GSSC (Eq. gssm), the gradient norm $\lVert \partial h_u / \partial x_v \rVert$ does not decay as $\mathrm{spd}(u,v)$ grows, where $\mathrm{spd}$ denotes the shortest-path distance.

Figures (4)

Figure 1: Comparison of Sequence State Space Conv. (left) and Graph State Space Conv. (right).
Figure 2: Illustration of Graph State Space Convolution (GSSC).
Figure 3: Preprocessing costs per graph.
Figure 4: Model training and inference costs per graph.

Theorems & Definitions (7)

Proposition 3.1
Remark 3.1
Proposition 3.2
Proposition 3.3: Counting paths and cycles
proof
proof
proof
