Colorful Talks with Graphs: Human-Interpretable Graph Encodings for Large Language Models
Angelo Zangari, Peyman Baghershahi, Sourav Medya
TL;DR
The paper tackles the mismatch between graph-structured reasoning and text-based LLMs by introducing CL-OWL, a graph-to-text encoding that maps Weisfeiler–Lehman structural refinement to human-interpretable color tokens within prompts. It constructs node-level descriptors via ordered 1-WL, gives them a centrality-like interpretation, and shows that color-based, structure-preserving prompts improve LLM performance on global graph tasks and long-range dependencies across synthetic and real datasets. Theoretical results link ordered WL labels to distance-weighted connectivity. In the experiments, color-enhanced prompts (CL-OWL) outperform baselines on maximum flow, shortest path, and related tasks, with exceptions such as triangle counting, where local pattern matching dominates. Overall, the approach improves LLM-based graph reasoning by aligning graph structure with linguistic priors while keeping the encoding interpretable.
Abstract
Graph problems are fundamentally challenging for large language models (LLMs). While LLMs excel at processing unstructured text, graph tasks require reasoning over explicit structure, permutation invariance, and computationally complex relationships, creating a mismatch with the representations of text-based models. Our work investigates how LLMs can be effectively applied to graph problems despite these barriers. We introduce a human-interpretable structural encoding strategy for graph-to-text translation that injects graph structure directly into natural language prompts. Our method computes a variant of Weisfeiler-Lehman (WL) similarity classes and maps them to human-like color tokens rather than numeric labels. The key insight is that semantically meaningful and human-interpretable cues may be processed more effectively by LLMs than opaque symbolic encodings. Experimental results on multiple algorithmic and predictive graph tasks show that our method yields considerable improvements on both synthetic and real-world datasets. By capturing both local and global dependencies, our method enhances LLM performance, especially on graph tasks that require reasoning over global graph structure.
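To make the general idea concrete, the following is a minimal sketch of coloring nodes by structural class and writing those colors into a prompt. It uses plain 1-WL color refinement, not the ordered variant the paper describes, and the palette, the function names (wl_colors, encode_prompt), and the sentence template are all hypothetical choices made for illustration rather than the authors' implementation.

```python
# Hypothetical palette; the paper's actual color vocabulary is not specified here.
PALETTE = ["red", "blue", "green", "yellow", "purple", "orange", "pink", "brown"]


def wl_colors(adj):
    """Plain 1-WL color refinement (an assumption; not the paper's ordered variant).

    adj maps each node to an iterable of its neighbors (undirected graph).
    Returns a dict mapping each node to an integer color class.
    """
    colors = {v: 0 for v in adj}  # start with one uniform color
    for _ in range(len(adj)):
        # A node's signature is its own color plus the sorted multiset
        # of its neighbors' colors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        # Relabel signatures with compact integer ids, in sorted order so the
        # result is deterministic.
        ids = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new = {v: ids[sigs[v]] for v in adj}
        # Refinement never merges classes, so an unchanged class count
        # means the partition is stable.
        if len(set(new.values())) == len(set(colors.values())):
            break
        colors = new
    return colors


def encode_prompt(adj, colors):
    """Render nodes and edges as sentences with color words (hypothetical template)."""
    def name(v):
        c = colors[v]
        return PALETTE[c] if c < len(PALETTE) else f"color{c}"

    lines = [f"Node {v} is {name(v)}." for v in sorted(adj)]
    edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
    lines += [
        f"Node {u} ({name(u)}) is connected to node {v} ({name(v)})."
        for u, v in edges
    ]
    return "\n".join(lines)


# Path graph 0 - 1 - 2 - 3: the two endpoints share one color,
# the two interior nodes share another.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
colors = wl_colors(adj)
print(encode_prompt(adj, colors))
```

On the path graph, the two endpoints fall into one class and the two interior nodes into another, so structurally equivalent nodes receive the same color word in the prompt. How the paper orders classes and assigns specific colors is not reproduced by this sketch.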
