When Structure Doesn't Help: LLMs Do Not Read Text-Attributed Graphs as Effectively as We Expected

Haotian Xu, Yuning You, Tengfei Ma

TL;DR

The paper interrogates whether explicit graph structure improves LLM-based graph reasoning, across text-attributed graphs and molecular graphs. Through systematic ablations of template-based encodings, GNN adapters, and various backbones, it finds that rich node semantics largely drive performance, while structural priors (e.g., Laplacian positional encodings, message passing) offer marginal or negative gains. This challenges the conventional emphasis on topology in graph learning and suggests a shift toward semantics-driven representations and node sequencing to leverage LLM capabilities. The findings hold across diverse tasks and datasets, including TAGs and molecular benchmarks, with implications for designing future graph foundation models that prioritize meaningful textual context over handcrafted structural cues.

Abstract

Graphs provide a unified representation of semantic content and relational structure, making them a natural fit for domains such as molecular modeling, citation networks, and social graphs. Meanwhile, large language models (LLMs) have excelled at understanding natural language and integrating cross-modal signals, sparking interest in their potential for graph reasoning. Recent work has explored this either by designing template-based graph encodings or by using graph neural networks (GNNs) to encode structural information. In this study, we investigate how different strategies for encoding graph structure affect LLM performance on text-attributed graphs. Surprisingly, our systematic experiments reveal that: (i) LLMs leveraging only node textual descriptions already achieve strong performance across tasks; and (ii) most structural encoding strategies offer marginal or even negative gains. We show that explicit structural priors are often unnecessary and, in some cases, counterproductive when powerful language models are involved. This represents a significant departure from traditional graph learning paradigms and highlights the need to rethink how structure should be represented and utilized in the LLM era. Our study systematically challenges the foundational assumption that structure is inherently beneficial for LLM-based graph reasoning, opening the door to new, semantics-driven approaches for graph learning.
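To make the comparison in the abstract concrete, the following minimal sketch builds two prompts for the same target node: one using only the node's text, and one that also lists neighbor texts and edges in a fixed template. The function name `build_prompt`, the template wording, and the toy graph are assumptions for illustration only; they are not taken from the paper.

```python
from typing import Dict, List, Tuple


def build_prompt(
    target: int,
    node_text: Dict[int, str],
    edges: List[Tuple[int, int]],
    include_structure: bool,
) -> str:
    """Serialize a small text-attributed graph into a plain-text prompt.

    Hypothetical template for illustration; not the paper's exact wording.
    """
    lines = [f"Target node {target}: {node_text[target]}"]
    if include_structure:
        # Collect the target's neighbors from an undirected edge list.
        neighbors = sorted(
            ({v for u, v in edges if u == target}
             | {u for u, v in edges if v == target}) - {target}
        )
        for n in neighbors:
            lines.append(f"Neighbor node {n}: {node_text[n]}")
        edge_str = ", ".join(f"({u}, {v})" for u, v in edges)
        lines.append(f"Edges: {edge_str}")
    lines.append("Question: which category does the target node belong to?")
    return "\n".join(lines)


# Toy citation graph (made-up titles, for illustration only).
node_text = {
    0: "A survey of graph neural networks",
    1: "Language models for node classification",
    2: "Prompting strategies for large language models",
}
edges = [(0, 1), (1, 2)]

text_only = build_prompt(1, node_text, edges, include_structure=False)
with_structure = build_prompt(1, node_text, edges, include_structure=True)
```

The two strings differ only in the structural lines, so feeding each to the same model isolates the effect of the added structure, which is the kind of ablation the abstract describes.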

Paper Structure

This paper contains 20 sections, 5 figures, 7 tables.

Figures (5)

  • Figure 1: A common paradigm for aligning graph-structured data with LLMs. Left: the graph (citation network, molecule, protein, etc.) is defined and parameterized with an appropriate structure. Middle: two strategies for encoding graphs into LLM-friendly representations. Template-based encoding arranges the nodes of the graph according to a predefined sequence, while GNN-based encoding uses a pretrained or randomly initialized GNN module to encode graphs into the LLM hidden space. Right: the pipeline for aligning the graph modality with LLMs.
  • Figure 2: Increasing the number of adapter layers leads to notable performance degradation for GNN-based adapters, particularly GIN, which loses much of its generalizability in deeper configurations. In contrast, MLP adapters, without relying on structural information, maintain stable performance and exhibit greater robustness across varying depths.
  • Figure 3: How Pretrained Encoders Impact
  • Figure 4: Features for LLMs on GDL.
  • Figure 5: Left: Although a reasoning model can perform structured decision-making, it does not rely on structural information. Right: Altering the node sequence via GDC can sometimes yield an improvement.
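
The captions for Figures 1 and 5 both refer to the order in which nodes are presented to the LLM. As a minimal sketch of what "arranging nodes according to a predefined sequence" can look like, the code below orders nodes by breadth-first traversal from a chosen start node and serializes their texts in that order. Breadth-first order is used here only as a simple stand-in; it is not GDC and is not claimed to be the ordering used in the paper. The names `bfs_node_order` and `serialize_in_order` and the toy graph are hypothetical.

```python
from collections import deque
from typing import Dict, List


def bfs_node_order(adjacency: Dict[int, List[int]], start: int) -> List[int]:
    """Return the nodes reachable from `start` in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        # Visit neighbors in sorted order so the sequence is deterministic.
        for nbr in sorted(adjacency.get(node, [])):
            if nbr not in seen:
                seen.add(nbr)
                order.append(nbr)
                queue.append(nbr)
    return order


def serialize_in_order(
    adjacency: Dict[int, List[int]], node_text: Dict[int, str], start: int
) -> str:
    """Emit one line of text per node, following the breadth-first sequence."""
    order = bfs_node_order(adjacency, start)
    return "\n".join(f"[{pos}] {node_text[n]}" for pos, n in enumerate(order))


# Toy undirected graph given as an adjacency list (illustrative only).
adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
node_text = {0: "paper A", 1: "paper B", 2: "paper C", 3: "paper D"}

order_from_0 = bfs_node_order(adjacency, 0)
order_from_3 = bfs_node_order(adjacency, 3)
sequence = serialize_in_order(adjacency, node_text, 3)
```

Changing the start node changes the sequence the model sees without changing any node text, which gives one simple way to probe sensitivity to node ordering.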