LangGFM: A Large Language Model Alone Can be a Powerful Graph Foundation Model

Tianqianjin Lin, Pengwei Yan, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Jun Lin, Weikang Yuan, Junjie Cao, Changlong Sun, Xiaozhong Liu

TL;DR

This work proposes GFMBench, a systematic and comprehensive benchmark of 26 datasets, to make the evaluation of GFMs more consistent and to broaden its coverage and diversity across domains, tasks, and research interests within the graph learning community. It also introduces LangGFM, a novel GFM that relies entirely on large language models.

Abstract

Graph foundation models (GFMs) have recently gained significant attention. However, the unique data processing and evaluation setups employed by different studies hinder a deeper understanding of their progress. Additionally, current research tends to focus on specific subsets of graph learning tasks, such as structural tasks, node-level tasks, or classification tasks. As a result, existing GFMs often incorporate specialized modules tailored to particular task types, which costs them their applicability to other graph learning tasks and contradicts the original intent of foundation models to be universal. Therefore, to enhance consistency, coverage, and diversity across domains, tasks, and research interests within the graph learning community in the evaluation of GFMs, we propose GFMBench, a systematic and comprehensive benchmark comprising 26 datasets. Moreover, we introduce LangGFM, a novel GFM that relies entirely on large language models. By revisiting effective graph textualization principles and repurposing successful techniques from graph augmentation and graph self-supervised learning within the language space, LangGFM achieves performance on par with or exceeding the state of the art across GFMBench. These results offer new perspectives, experiences, and baselines to drive forward the evolution of GFMs.
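To make "graph textualization" concrete, the following is a minimal sketch of how a small graph could be rendered as text and wrapped into an instruction-tuning record. The function names, the two text formats, and the `instruction`/`input`/`output` field names are all assumptions chosen for illustration; they are not taken from the paper and are not claimed to match the formats it evaluates.

```python
import json


def graph_to_edge_list_text(nodes, edges):
    """Render a graph as plain-text lines: one per node, then one per edge.

    Hypothetical format for illustration only.
    """
    lines = [f"Node {nid}: {label}" for nid, label in nodes.items()]
    lines += [f"Edge: {src} -> {dst}" for src, dst in edges]
    return "\n".join(lines)


def graph_to_json_text(nodes, edges):
    """Render the same graph as a JSON string (a second hypothetical format)."""
    payload = {
        "nodes": [{"id": nid, "label": label} for nid, label in nodes.items()],
        "edges": [{"source": src, "target": dst} for src, dst in edges],
    }
    return json.dumps(payload, sort_keys=True)


def build_instruction_sample(graph_text, question, answer):
    """Wrap a textualized graph into a record for supervised instruction tuning.

    The field names are an assumption, not the paper's schema.
    """
    return {"instruction": question, "input": graph_text, "output": answer}


# A toy citation graph: node ids mapped to text labels, plus directed edges.
nodes = {0: "paper A", 1: "paper B", 2: "paper C"}
edges = [(0, 1), (1, 2)]

sample = build_instruction_sample(
    graph_to_edge_list_text(nodes, edges),
    "How many edges does this graph contain?",
    "2",
)
```

Rendering the same graph in more than one text format, as the two functions above do, is one way a single graph could yield several distinct training inputs in the language space.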

Paper Structure

This paper contains 29 sections, 8 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Comparison of GNNs and GFMs across node-, edge-, and graph-level tasks. Results for OFA [iclr2024ofa], LLaGA [icml2024llaga], Graph2Seq [icml2024graph2seq], and Graph2Token [icml2024graphtoken] are the best results reported in the respective works. Two observations: (1) different works report performance in markedly different ranges on datasets of the same name (e.g., OFA and LLaGA in node classification), or they use different metrics (e.g., OFA and LLaGA in link prediction); (2) converting graphs into text and performing instruction tuning on LLaMA3-8B-Instruct consistently outperforms the current state of the art. These findings motivate us to develop a benchmark that facilitates fair comparison among GFMs and to provide a simple yet powerful GFM for future development.
  • Figure 2: An illustration of instruction tuning for an LLM to perform graph tasks.
  • Figure 3: Preference of LLM for different graph formats.
  • Figure 4: Various Formats as Graph Augmentations.
  • Figure 5: Effect of the proposed self-supervised learning.