GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning

Jie Peng, Jiarui Ji, Runlin Lei, Zhewei Wei, Yongchao Liu, Chuntao Hong

TL;DR

GDGB (the Generative DyTAG Benchmark) addresses two gaps: the lack of high-quality, text-rich dynamic text-attributed graph (DyTAG) datasets and the absence of standardized generative tasks. It provides eight diverse, text-rich datasets, defines Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph Generation (IDGG), and proposes GAG-General, an LLM-based multi-agent framework for reproducible DyTAG generation. The benchmark couples three evaluation dimensions (graph structure, textual quality, and graph embeddings) into holistic metrics. Extensive experiments show that GDGB enables robust assessment of generative DyTAG methods and reveal the critical role of textual attributes in generation quality. Findings show GDGB outperforms prior DTGB baselines across structural, textual, and embedding fidelity, and that IDGG offers practical benefits for data augmentation in inductive learning, underscoring the framework’s practical relevance for real-world dynamic graph applications.

Abstract

Dynamic Text-Attributed Graphs (DyTAGs), which intricately integrate structural, temporal, and textual attributes, are crucial for modeling complex real-world systems. However, most existing DyTAG datasets exhibit poor textual quality, which severely limits their utility for generative DyTAG tasks requiring semantically rich inputs. Additionally, prior work mainly focuses on discriminative tasks on DyTAGs, resulting in a lack of standardized task formulations and evaluation protocols tailored for DyTAG generation. To address these critical issues, we propose Generative DyTAG Benchmark (GDGB), which comprises eight meticulously curated DyTAG datasets with high-quality textual features for both nodes and edges, overcoming limitations of prior datasets. Building on GDGB, we define two novel DyTAG generation tasks: Transductive Dynamic Graph Generation (TDGG) and Inductive Dynamic Graph Generation (IDGG). TDGG transductively generates a target DyTAG based on the given source and destination node sets, while the more challenging IDGG introduces new node generation to inductively model the dynamic expansion of real-world graph data. To enable holistic evaluation, we design multifaceted metrics that assess the structural, temporal, and textual quality of the generated DyTAGs. We further propose GAG-General, an LLM-based multi-agent generative framework tailored for reproducible and robust benchmarking of DyTAG generation. Experimental results demonstrate that GDGB enables rigorous evaluation of TDGG and IDGG, with key insights revealing the critical interplay of structural and textual features in DyTAG generation. These findings establish GDGB as a foundational resource for advancing generative DyTAG research and unlocking further practical applications in DyTAG generation. The dataset and source code are available at https://github.com/Lucas-PJ/GDGB-ALGO.
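
To make the distinction between the two tasks concrete, the following is a minimal sketch, not the authors' implementation or the API of the GDGB repository. It assumes a DyTAG can be represented as a list of timestamped, text-carrying edges between text-carrying nodes. The names `Node`, `Edge`, `is_transductive`, and `new_nodes` are hypothetical, and the review texts are invented for illustration. Under these assumptions, a TDGG output only connects nodes from the given source and destination sets, whereas an IDGG output may also introduce nodes that were not given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    text: str  # textual attribute of the node (hypothetical field name)


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    timestamp: float
    text: str  # textual attribute of the edge (hypothetical field name)


def is_transductive(edges, known_src, known_dst):
    """True if every edge links a given source node to a given destination node."""
    return all(e.src in known_src and e.dst in known_dst for e in edges)


def new_nodes(edges, known_src, known_dst):
    """Return the endpoint ids that are absent from the given node sets."""
    known = set(known_src) | set(known_dst)
    endpoints = {e.src for e in edges} | {e.dst for e in edges}
    return endpoints - known


# Illustrative data only: a tiny bipartite user-product review graph.
users = {"u1", "u2"}
products = {"p1"}

tdgg_edges = [
    Edge("u1", "p1", 1.0, "Great moisturizer."),
    Edge("u2", "p1", 2.0, "Too oily for my skin."),
]

# An inductive generator may also emit a node that was not given.
generated_user = Node("u3", "New reviewer interested in skincare.")
idgg_edges = tdgg_edges + [
    Edge(generated_user.node_id, "p1", 3.0, "Bought it after reading reviews."),
]

print(is_transductive(tdgg_edges, users, products))
print(is_transductive(idgg_edges, users, products))
print(new_nodes(idgg_edges, users, products))
```

A check of this kind only separates the two settings by their node sets. It says nothing about the structural, temporal, or textual quality of the generated graph, which the benchmark's metrics are designed to assess.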

Paper Structure

This paper contains 55 sections, 2 equations, 12 figures, 69 tables, 2 algorithms.

Figures (12)

  • Figure 1: Comparison of node and edge texts across GDGB and DTGB datasets in terms of length, perplexity (PPL (Reversed)), and LLM-based rating.
  • Figure 2: A case study on TDGG and IDGG in the Sephora product reviews scenario.
  • Figure 3: Distribution of node degree on GDGB datasets.
  • Figure 4: Distribution of the number of edges for each label on GDGB datasets.
  • Figure 5: Left: Average node text lengths on GDGB and DTGB datasets. In non-bipartite graphs, text lengths are averaged across all nodes. For bipartite graphs, averages are calculated for source nodes. Right: Average edge text lengths on each dataset.
  • ...and 7 more figures