Graph Learning in the Era of LLMs: A Survey from the Perspective of Data, Models, and Tasks

Xunkai Li, Zhengyu Wu, Jiayi Wu, Hanwen Cui, Jishuo Jia, Rong-Hua Li, Guoren Wang

TL;DR

This survey addresses the problem of integrating Graph Neural Networks with Large Language Models to exploit both graph structure and rich textual descriptions in Text-Attributed Graphs. It introduces a data–model–task framework, classifies approaches into five collaboration paradigms (independent modules, GNN-enhanced LLM, LLM-enhanced GNN, GNN-only, LLM-only), and surveys training paradigms (pre-training, fine-tuning, inference) across single- and multi-domain tasks. Key contributions include a taxonomy of datasets by domain, a consolidated overview of methods, and a roadmap toward graph foundation model capabilities such as cross-domain generalization and graph-aware reasoning. The work emphasizes data-centric design and cross-domain applicability, offering guidance for future research and practical deployment in industrial TAG settings.

Abstract

With the increasing prevalence of cross-domain Text-Attributed Graph (TAG) Data (e.g., citation networks, recommendation systems, social networks, and AI4Science), the integration of Graph Neural Networks (GNNs) and Large Language Models (LLMs) into a unified Model architecture (e.g., LLM as enhancer, LLM as collaborator, LLM as predictor) has emerged as a promising technological paradigm. The core of this new graph learning paradigm lies in the synergistic combination of GNNs' ability to capture complex structural relationships and LLMs' proficiency in understanding informative contexts from the rich textual descriptions of graphs. Therefore, we can leverage graph description texts with rich semantic context to fundamentally enhance Data quality, thereby improving the representational capacity of model-centric approaches in line with data-centric machine learning principles. By leveraging the strengths of these distinct neural network architectures, this integrated approach addresses a wide range of TAG-based Tasks (e.g., graph learning, graph reasoning, and graph question answering), particularly in complex industrial scenarios (e.g., supervised, few-shot, and zero-shot settings). In other words, we can treat text as a medium to enable cross-domain generalization of graph learning Models, allowing a single graph model to effectively handle the diversity of downstream graph-based Tasks across different data domains. This work serves as a foundational reference for researchers and practitioners looking to advance graph learning methodologies in the rapidly evolving landscape of LLMs. We continuously maintain the related open-source materials at https://github.com/xkLi-Allen/Awesome-GNN-in-LLMs-Papers.

Paper Structure

This paper contains 12 sections, 5 tables.