Message-Passing State-Space Models: Improving Graph Learning with Modern Sequence Modeling
Andrea Ceni, Alessio Gravina, Claudio Gallicchio, Davide Bacciu, Carola-Bibiane Schönlieb, Moshe Eliasof
TL;DR
MP-SSM integrates modern state-space modeling into the message-passing framework in a principled way, enabling stable, long-range information propagation on both static and temporal graphs while preserving permutation equivariance. The core idea is a linear diffusion on graphs via the recurrence X_{t+1} = A X_t W + U_{t+1} B, followed by a graph-agnostic MLP; deep stacking yields a large effective receptive field without relying on nonlinear diffusion. The authors provide an exact Jacobian-based sensitivity analysis, derive lower bounds on gradient flow, and show that MP-SSM mitigates over-squashing and vanishing gradients in the deep regime. The linear recurrence also admits a fast parallel implementation. Empirically, MP-SSM achieves state-of-the-art or competitive performance on long-range propagation, heterophilic, and spatiotemporal forecasting benchmarks, with runtimes comparable to standard GCNs. These results position MP-SSM as a versatile and scalable framework for graph learning with rigorous theoretical grounding.
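To make the recurrence concrete, here is a minimal NumPy sketch of one block: the linear update X_{t+1} = A X_t W + U_{t+1} B followed by a node-wise MLP. Everything beyond the recurrence itself is an assumption made for illustration and is not taken from the paper: the GCN-style symmetric normalization of A, the zero initial state, the two-layer ReLU MLP, the toy dimensions, and feeding the same input U at every step for a static graph. The function names `normalized_adjacency` and `mp_ssm_block` are hypothetical.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Assumed choice: GCN-style D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    A = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    A += np.eye(num_nodes)  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def mp_ssm_block(A, U_seq, W, B, mlp):
    """Iterate X_{t+1} = A X_t W + U_{t+1} B from X_0 = 0, then apply a node-wise MLP."""
    X = np.zeros((A.shape[0], W.shape[0]))
    for U in U_seq:
        X = A @ X @ W + U @ B  # linear diffusion: no nonlinearity inside the loop
    return mlp(X)

rng = np.random.default_rng(0)
n, d_in, d_hid, d_out, steps = 4, 3, 5, 2, 6  # toy sizes (assumed)

A = normalized_adjacency([(0, 1), (1, 2), (2, 3)], n)  # path graph on 4 nodes
W = 0.5 * rng.standard_normal((d_hid, d_hid))
B = rng.standard_normal((d_in, d_hid))
W1 = rng.standard_normal((d_hid, d_hid))
W2 = rng.standard_normal((d_hid, d_out))

def mlp(X):
    # Graph-agnostic: acts on each node's row independently (assumed 2-layer ReLU).
    return np.maximum(X @ W1, 0.0) @ W2

U = rng.standard_normal((n, d_in))
Y = mp_ssm_block(A, [U] * steps, W, B, mlp)  # static graph: same input each step (assumed)

# Permutation equivariance check: relabeling the nodes permutes the output rows.
perm = np.array([2, 0, 3, 1])
P = np.eye(n)[perm]
Y_perm = mp_ssm_block(P @ A @ P.T, [P @ U] * steps, W, B, mlp)

print(Y.shape)
print(np.allclose(Y_perm, P @ Y))
```

The equivariance check works because the update only mixes nodes through A and only mixes channels through W and B, so a node relabeling commutes with every step; the MLP preserves this because it acts row by row.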
Abstract
The recent success of State-Space Models (SSMs) in sequence modeling has motivated their adaptation to graph learning, giving rise to Graph State-Space Models (GSSMs). However, existing GSSMs operate by applying SSM modules to sequences extracted from graphs, often compromising core properties such as permutation equivariance, message-passing compatibility, and computational efficiency. In this paper, we introduce a new perspective by embedding the key principles of modern SSM computation directly into the Message-Passing Neural Network framework, resulting in a unified methodology for both static and temporal graphs. Our approach, MP-SSM, enables efficient, permutation-equivariant, and long-range information propagation while preserving the architectural simplicity of message passing. Crucially, MP-SSM enables an exact sensitivity analysis, which we use to theoretically characterize information flow and evaluate issues like vanishing gradients and over-squashing in the deep regime. Furthermore, our design choices allow for a highly optimized parallel implementation akin to modern SSMs. We validate MP-SSM across a wide range of tasks, including node classification, graph property prediction, long-range benchmarks, and spatiotemporal forecasting, demonstrating both its versatility and strong empirical performance.
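Why linearity makes an exact sensitivity analysis possible can be seen by unrolling the recurrence X_{t+1} = A X_t W + U_{t+1} B stated in the TL;DR. The derivation below is a sketch that follows directly from that recurrence; the index notation (node indices i, j and channel indices c, d) is introduced here for illustration and may differ from the paper's own formulation.

```latex
% Unrolling from an initial state X_0:
X_t \;=\; A^{t} X_0 W^{t} \;+\; \sum_{s=1}^{t} A^{\,t-s}\, U_s\, B\, W^{\,t-s}

% Entrywise sensitivity of node i, channel c at step t
% to the input of node j, channel d at step s \le t:
\frac{\partial\, (X_t)_{i c}}{\partial\, (U_s)_{j d}}
  \;=\; \bigl(A^{\,t-s}\bigr)_{i j}\,\bigl(B\, W^{\,t-s}\bigr)_{d c}
```

The Jacobian factorizes into a purely topological term (powers of A) and a purely parametric term (powers of W), with no state-dependent nonlinearity in between. The same unrolled sum is also what allows all time steps to be computed in parallel rather than sequentially, as in modern SSMs.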
