Towards Neural Graph Data Management

Yufei Li; Yisen Gao; Jiaxin Bai; Jiaxuan Xiong; Haoyu Huang; Zhongwei Xie; Hong Ting Tsang; Yangqiu Song

TL;DR

NGDBench is a unified benchmark for evaluating neural graph database capabilities across five diverse domains, including finance, medicine, and AI agent tooling. It supports the full Cypher query language, enabling complex pattern matching, variable-length paths, and numerical aggregations.

Abstract

While AI systems have made remarkable progress in processing unstructured text, structured data, such as graphs stored in databases, continues to grow rapidly yet remains difficult for neural models to use effectively. We introduce NGDBench, a unified benchmark for evaluating neural graph database capabilities across five diverse domains, including finance, medicine, and AI agent tooling. Unlike prior benchmarks limited to elementary logical operations, NGDBench supports the full Cypher query language, enabling complex pattern matching, variable-length paths, and numerical aggregations, while incorporating realistic noise injection and dynamic data management operations. Our evaluation of state-of-the-art LLMs and RAG methods reveals significant limitations in structured reasoning, noise robustness, and analytical precision, establishing NGDBench as a critical testbed for advancing neural graph data management. Our code and data are available at https://github.com/HKUST-KnowComp/NGDBench.

Paper Structure (42 sections, 12 equations, 3 figures, 8 tables, 1 algorithm)

Introduction
Related Work
NGDBench
Data Preparation
Structured Domains
Unstructured Domains
Perturbation Generation
Query Generation
Query Template Library Construction
Perturbation-Aware Query Sampler
Task Formulation
Task I: Robust Analytical Query Answering
Problem Definition.
Task II: Sequential In-Context Graph Editing for Dynamic Graph Management
Problem Definition.
...and 27 more sections

Figures (3)

Figure 1: Motivations, Challenges, and Contributions of NGDBench.
Figure 2: The NGDBench Framework Pipeline. (1) Data Collection & Unification: Diverse structured and unstructured domains are standardized into a unified LPG graph representation. (2) Core Construction: A Perturbation Engine generates paired Clean Ground Truth and Observed Noisy Graphs, alongside a query sampling engine to generate queries according to template libraries. (3) Evaluation Tasks: Systems are benchmarked on two tasks: Robust Analytical QA over noisy data and Dynamic Graph Management (CUD operations) verified against ground truth.
Figure 3: Left: evaluation of boolean queries on three datasets. Right: evaluation of dynamic steps on NGD-Prime, where lower MLRE indicates better performance.
