GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining
Chunyu Wei, Wenji Hu, Xingjia Hao, Xin Wang, Yifan Yang, Yueguo Chen, Yang Tian, Yunhai Wang
TL;DR
GraphChain addresses the core challenge of applying large language models (LLMs) to large graphs, namely limited context and inflexible reasoning, by letting the model chain graph-processing tools dynamically. It combines Progressive Graph Distillation, which learns compact, task-focused tool sequences, with Structure-aware Test-Time Adaptation, which tailors the tool-selection strategy to a graph's topology using a lightweight adapter and Laplacian-based fingerprints. Tool sequences are drawn from a graph processing tool library and optimized with reinforcement learning (PPO with GAE) under memory and relevance constraints, a trade-off framed through an information bottleneck lens. Empirically, GraphChain outperforms state-of-the-art baselines by about 20.7% in relative accuracy across diverse graph domains and scales to graphs with up to ~200,000 nodes, with strong transferability and robustness across models and tool sets.
Abstract
Large Language Models (LLMs) face significant limitations when applied to large-scale graphs, struggling with context constraints and inflexible reasoning. We present GraphChain, a framework that enables LLMs to analyze complex graphs through dynamic sequences of specialized tools, mimicking human exploratory intelligence. Our approach introduces two key innovations: (1) Progressive Graph Distillation, a reinforcement learning mechanism that generates optimized tool sequences balancing task relevance with information compression, and (2) Structure-aware Test-Time Adaptation, which efficiently tailors tool selection strategies to diverse graph topologies using spectral properties and lightweight adapters without costly retraining. Experiments show GraphChain significantly outperforms prior methods, enabling scalable and adaptive LLM-driven graph analysis.
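The TL;DR names PPO with GAE (Generalized Advantage Estimation) as the optimizer for tool sequences. As a rough illustration of the GAE step only, the sketch below computes advantages for a single tool-chaining episode. The function name `gae_advantages`, the default `gamma` and `lam` values, and the reward layout are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE smoothing parameter (assumed default)
) -> List[float]:
    """Generalized Advantage Estimation for one episode.

    rewards[t] is the reward after the t-th tool call.
    values has len(rewards) + 1 entries; the last entry is the bootstrap
    value of the state after the final step (0.0 if the episode ended).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next step.
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Hypothetical three-step tool chain in which only the final answer is rewarded.
example = gae_advantages(
    rewards=[0.0, 0.0, 1.0],
    values=[0.0, 0.0, 0.0, 0.0],
    gamma=0.5,
    lam=1.0,
)
```

With `lam=1.0` the estimate reduces to the discounted return minus the value baseline, and with `lam=0.0` it reduces to the one-step TD error; intermediate values trade variance against bias.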
