How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension

Xinnan Dai, Haohao Qu, Yifen Shen, Bohang Zhang, Qihao Wen, Wenqi Fan, Dongsheng Li, Jiliang Tang, Caihua Shan

TL;DR

This work investigates whether large language models can understand and reason about graph patterns. It introduces a benchmark that covers patterns specified by terminology, patterns specified by topology, and patterns that must be discovered from data. The evaluation covers 11 tasks across 7 models on synthetic and real-world graphs, with inputs given as adjacency lists or edge lists. Tasks include pattern translation, isomorphic mapping, graph modification, pattern detection, dense-subgraph mining, frequent subgraph extraction, and discriminative pattern learning. Key findings are that LLMs have preliminary abilities to understand graph patterns, with O1-mini frequently delivering the strongest performance, and that formatting inputs to align with knowledge acquired during pretraining improves results. However, the strategies LLMs use may differ from those of conventional graph algorithms, and hallucinations occur in some cases. The benchmark is designed to be extended to new models and datasets, and its findings can inform prompting and model design for graph-centric applications.
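To make the two input formats named above concrete, the following is a minimal sketch of how one small undirected graph might be serialized as an edge list and as an adjacency list before being placed in a prompt. The function names and the exact text layout are assumptions chosen for illustration; the benchmark's actual prompt templates are not shown in this summary and may differ.

```python
from collections import defaultdict


def to_edge_list_text(edges):
    """Render an undirected graph as a single-line edge-list string."""
    return ", ".join(f"({u}, {v})" for u, v in edges)


def to_adjacency_list_text(edges):
    """Render the same graph as an adjacency list, one node per line."""
    neighbors = defaultdict(set)
    for u, v in edges:
        # Undirected: record the edge in both directions.
        neighbors[u].add(v)
        neighbors[v].add(u)
    return "\n".join(
        f"{node}: {sorted(neighbors[node])}" for node in sorted(neighbors)
    )


# Hypothetical example graph: a triangle (0, 1, 2) with a tail node 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
edge_text = to_edge_list_text(edges)
adj_text = to_adjacency_list_text(edges)
print(edge_text)
print(adj_text)
```

Both strings describe the same graph; only the textual layout a model would see changes.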

Abstract

Benchmarking the capabilities and limitations of large language models (LLMs) in graph-related tasks is becoming an increasingly popular and crucial area of research. Recent studies have shown that LLMs exhibit a preliminary ability to understand graph structures and node features. However, the potential of LLMs in graph pattern mining, a key component in fields such as computational chemistry, biology, and social network analysis, remains largely unexplored. To bridge this gap, this work introduces a comprehensive benchmark to assess LLMs' capabilities in graph pattern tasks. The benchmark evaluates whether LLMs can understand graph patterns based on either terminological or topological descriptions. Additionally, it tests the LLMs' capacity to autonomously discover graph patterns from data. The benchmark encompasses both synthetic and real datasets and a variety of models, with a total of 11 tasks and 7 models. Our experimental framework is designed for easy expansion to accommodate new models and datasets. Our findings reveal that: (1) LLMs have preliminary abilities to understand graph patterns, with O1-mini outperforming the other models on the majority of tasks; (2) Formatting input data to align with the knowledge acquired during pretraining can enhance performance; (3) The strategies employed by LLMs may differ from those used in conventional algorithms.

Paper Structure (42 sections, 5 theorems, 4 equations, 8 figures, 35 tables, 3 algorithms)

Key Result

Theorem 1

(Informal) For any LOCAL algorithm A, there exists a Transformer with edge list as input that can simulate A.
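As background for the theorem, the sketch below illustrates what a LOCAL-style algorithm computes when the graph is given as an edge list: in each synchronous round, every node reads its neighbors' previous states and updates its own. This is not the paper's Transformer construction; the helper `local_rounds` and the minimum-ID example are assumptions introduced here purely for illustration.

```python
def local_rounds(edges, init_state, update, rounds):
    """Run a synchronous LOCAL-style computation on an undirected edge list.

    init_state(node) gives each node's starting state; update(own, msgs)
    combines a node's state with the list of its neighbors' states.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    state = {node: init_state(node) for node in neighbors}
    for _ in range(rounds):
        # Synchronous update: every node reads only previous-round states.
        state = {
            node: update(state[node], [state[n] for n in sorted(neighbors[node])])
            for node in neighbors
        }
    return state


# Hypothetical example: on the path 0-1-2-3, each node learns the
# smallest node ID within 2 hops of itself.
edges = [(0, 1), (1, 2), (2, 3)]
min_id = local_rounds(
    edges,
    init_state=lambda node: node,
    update=lambda own, msgs: min([own] + msgs),
    rounds=2,
)
print(min_id)  # {0: 0, 1: 0, 2: 0, 3: 1}
```

After two rounds node 3 has seen only nodes within two hops, so it reports 1 rather than 0; the number of rounds bounds how far information can travel.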

Figures (8)

  • Figure 1: The F1 score for terminology-based pattern detection (small and medium scale)
  • Figure 2: The influence of underlying algorithms used in pattern isomorphic mapping
  • Figure 3: The F1 score of topology-based pattern detection (small and medium scale)
  • Figure 4: The precision in various node degrees
  • Figure 5: The frequency of extracted patterns
  • ...and 3 more figures

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2: Formal version of Theorem 1
  • proof
  • Lemma 1: Equivalence between MPGNNs and LOCAL
  • Lemma 2: Representing MPGNNs by Transformers
  • proof
  • Lemma 3