RiemannGFM: Learning a Graph Foundation Model from Riemannian Geometry
Li Sun, Zhenhao Huang, Suyang Zhou, Qiqi Wan, Hao Peng, Philip Yu
TL;DR
Existing graph foundation models rely heavily on textual attributes and Euclidean embeddings, which hinders transfer to graphs without text. RiemannGFM is a universal pretraining framework that learns a structural vocabulary of trees and cycles on a product bundle of constant-curvature spaces. It combines a Vocabulary Learning Module (cross-geometry attention) with a Global Learning Module (bundle convolution) and is trained via geometric contrastive learning. The approach enables cross-domain transfer and effective few-shot learning, the learned structural knowledge is expressive and transferable across graph types, and results are robust to the choice of pretraining dataset. The work highlights the potential of Riemannian geometry for capturing universal graph structure in scalable, domain-agnostic graph representations.
Abstract
The foundation model has heralded a new era in artificial intelligence: a single pretrained model offers cross-domain transferability across different datasets. Graph neural networks excel at learning from graph data, the omnipresent non-Euclidean structure, but often lack generalization capacity. Hence, graph foundation models are drawing increasing attention, and recent efforts have leveraged Large Language Models. On the one hand, existing studies primarily focus on text-attributed graphs, while a wider range of real graphs do not contain rich textual attributes. On the other hand, the sequential graph descriptions tailored for Large Language Models neglect structural complexity, which is a predominant characteristic of graphs. Such limitations motivate an important question: Can we go beyond Large Language Models and pretrain a universal model that learns the structural knowledge of any graph? In the language and vision domains, the answer is a shared vocabulary. We observe that shared substructures also exist in the graph domain, which opens a new opportunity for graph foundation models built on a structural vocabulary. The key innovation is the discovery of a simple yet effective structural vocabulary of trees and cycles, and we explore its inherent connection to Riemannian geometry. Herein, we present a universal pretraining model, RiemannGFM. Concretely, we first construct a novel product bundle to incorporate the diverse geometries of the vocabulary. Then, on this constructed space, we stack Riemannian layers in which the structural vocabulary, regardless of the specific graph, is learned on the Riemannian manifold, offering cross-domain transferability. Extensive experiments show the effectiveness of RiemannGFM on a diversity of real graphs.
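
To make the geometric intuition concrete, the following is a minimal sketch and not the authors' implementation. It assumes, as is common in Riemannian graph learning but not stated in the abstract, that tree-like structure is placed in a hyperbolic factor (Lorentz model, curvature kappa < 0) and cycle-like structure in a spherical factor (curvature kappa > 0). It covers only geodesic distances on a product of two constant-curvature factors and ignores the bundle structure entirely. All function names are hypothetical.

```python
import math


def lift_to_hyperboloid(v, kappa=-1.0):
    """Lift Euclidean coordinates v onto the hyperboloid <x, x>_L = 1/kappa (kappa < 0)."""
    x0 = math.sqrt(-1.0 / kappa + sum(c * c for c in v))
    return [x0] + list(v)


def hyperbolic_distance(x, y, kappa=-1.0):
    """Geodesic distance in the Lorentz model with constant curvature kappa < 0."""
    # Lorentz inner product: the first coordinate carries a minus sign.
    inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
    # On the manifold kappa * inner >= 1; clamp to guard against rounding.
    arg = max(kappa * inner, 1.0)
    return math.acosh(arg) / math.sqrt(-kappa)


def spherical_distance(x, y, kappa=1.0):
    """Geodesic distance on the sphere <x, x> = 1/kappa with curvature kappa > 0."""
    inner = sum(a * b for a, b in zip(x, y))
    # On the manifold |kappa * inner| <= 1; clamp to guard against rounding.
    arg = min(1.0, max(-1.0, kappa * inner))
    return math.acos(arg) / math.sqrt(kappa)


def product_distance(x, y, kappa_h=-1.0, kappa_s=1.0):
    """Distance on the product of one hyperbolic and one spherical factor.

    x and y are (hyperbolic_point, spherical_point) pairs. The product
    metric combines the per-factor distances in an l2 fashion.
    """
    d_h = hyperbolic_distance(x[0], y[0], kappa_h)
    d_s = spherical_distance(x[1], y[1], kappa_s)
    return math.sqrt(d_h * d_h + d_s * d_s)


if __name__ == "__main__":
    # Two nodes: each has a hyperbolic part (tree-like) and a spherical part (cycle-like).
    node_a = (lift_to_hyperboloid([0.0, 0.0]), [1.0, 0.0])
    node_b = (lift_to_hyperboloid([0.5, 0.2]), [0.0, 1.0])
    print(product_distance(node_a, node_b))
```

The curvatures are fixed constants here for simplicity. Whether they are fixed or learned in RiemannGFM is not specified in the abstract.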
