Modeling Hypergraph Using Large Language Models
Bingqiao Gu, Jiale Zeng, Xingqin Qi, Dong Li
TL;DR
This work tackles the scarcity of large-scale real hypergraph data by introducing HyperLLM, an LLM-driven hypergraph generator that uses a two-phase framework and a four-agent collaboration to synthesize realistic, semantically coherent hypergraphs. By grounding generation in eight universal real-world patterns and a microscopic high-order preferential attachment model, HyperLLM achieves high fidelity with minimal prior information. The method combines construction-phase iterative generation with a multi-agent evolution phase guided by high-order prompts, enabling efficient yet rich hypergraph synthesis. Empirical results on eight datasets demonstrate superior structural and dynamic pattern alignment, suggesting LLM-based, agent-driven frameworks as a promising direction for scalable, realistic hypergraph modeling.
Abstract
Because hypergraphs can model high-order relationships in complex systems, they have been applied to higher-order clustering, hypergraph neural networks, and computer vision. These applications rely heavily on access to high-quality, large-scale real-world hypergraph data. Yet, compared to traditional pairwise graphs, real hypergraph datasets remain scarce in both scale and diversity. This shortage significantly limits the development and evaluation of advanced hypergraph learning algorithms. Quickly generating large-scale hypergraphs that conform to the characteristics of real networks is therefore a crucial task that has received insufficient attention. Motivated by recent advances in large language models (LLMs), particularly their capabilities in semantic reasoning, structured generation, and simulating human behavior, we investigate whether LLMs can facilitate hypergraph generation from a fundamentally new perspective. We introduce HyperLLM, a novel LLM-driven hypergraph generator that simulates the formation and evolution of hypergraphs through multi-agent collaboration. The framework integrates prompts and structural feedback mechanisms to ensure that the generated hypergraphs reflect key real-world patterns. Extensive experiments across diverse datasets demonstrate that HyperLLM achieves superior fidelity to structural and temporal hypergraph patterns while requiring minimal statistical priors. Our findings suggest that LLM-based frameworks offer a promising new direction for hypergraph modeling.
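
For readers unfamiliar with the preferential attachment idea mentioned in the TL;DR, the following is a minimal Python sketch of a classical, non-LLM baseline: hyperedges are added one at a time, and existing nodes join a new hyperedge with probability proportional to their current degree. It is not HyperLLM and not the paper's high-order model, which is not specified in this section; it is a simplified node-level variant for illustration only. The function name `generate_hypergraph` and its parameters are hypothetical.

```python
import random
from collections import Counter


def generate_hypergraph(num_edges, edge_size, seed=0):
    """Grow a hypergraph by node-level preferential attachment.

    Hypothetical helper for illustration; not the HyperLLM procedure.
    Returns (hyperedges, degree), where degree maps each node to the
    number of hyperedges that contain it.
    """
    rng = random.Random(seed)
    # Start from a single seed hyperedge over nodes 0 .. edge_size - 1.
    hyperedges = [frozenset(range(edge_size))]
    degree = Counter(range(edge_size))  # every seed node has degree 1

    for _ in range(num_edges - 1):
        new_node = len(degree)  # assumption: each step adds one new node
        members = {new_node}
        candidates = list(degree)  # existing nodes only
        # Fill the remaining slots without replacement, favoring
        # nodes that already belong to many hyperedges.
        while len(members) < edge_size:
            weights = [degree[v] for v in candidates]
            pick = rng.choices(candidates, weights=weights, k=1)[0]
            members.add(pick)
            candidates.remove(pick)
        for v in members:
            degree[v] += 1
        hyperedges.append(frozenset(members))

    return hyperedges, degree
```

Calling `generate_hypergraph(50, 3, seed=1)` returns 50 hyperedges of size 3 over 52 nodes. A high-order variant would instead weight groups of nodes by how often they already co-occur in hyperedges, rather than weighting single nodes by degree.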
