Multi-Agent GraphRAG: A Text-to-Cypher Framework for Labeled Property Graphs
Anton Gusarov, Anastasia Volkova, Valentin Khrulkov, Andrey Kuznetsov, Evgenii Maslov, Ivan Oseledets
TL;DR
The paper addresses the challenge of building natural language interfaces for Cypher-based querying over labeled property graphs by introducing Multi-Agent GraphRAG, a modular, agent-based workflow that iteratively generates, executes, and refines Cypher queries using schema-aware prompts and runtime verification on Memgraph. The approach combines query generation, entity verification, execution, and feedback aggregation in a self-correction loop to reduce hallucinations and syntax errors. Evaluations on CypherBench and an IFC-derived building dataset show consistent improvements over single-pass baselines across multiple LLMs, including better grounding and handling of complex queries. The work demonstrates practical potential for AI-assisted data access in industrial domains such as digital construction, and outlines directions for handling compositional queries and expanding domain-scale datasets.
Abstract
While Retrieval-Augmented Generation (RAG) methods commonly draw information from unstructured documents, the emerging paradigm of GraphRAG aims to leverage structured data such as knowledge graphs. Most existing GraphRAG efforts focus on Resource Description Framework (RDF) knowledge graphs, relying on triple representations and SPARQL queries. However, the potential of Cypher and Labeled Property Graph (LPG) databases to serve as scalable and effective reasoning engines within GraphRAG pipelines remains underexplored in the current research literature. To fill this gap, we propose Multi-Agent GraphRAG, a modular LLM-based agentic system for text-to-Cypher query generation that serves as a natural language interface to LPG-based graph data. Our proof-of-concept system features an LLM-based workflow for automated Cypher query generation and execution, using Memgraph as the graph database backend. Iterative content-aware correction and normalization, reinforced by an aggregated feedback loop, ensure both semantic and syntactic refinement of the generated queries. We evaluate our system on the CypherBench graph dataset, which covers several general domains with diverse types of queries. In addition, we demonstrate the performance of the proposed workflow on a property graph derived from IFC (Industry Foundation Classes) data, representing a digital twin of a building. This highlights how such an approach can bridge AI with real-world applications at scale, enabling industrial digital automation use cases.
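The generate, verify, execute, and refine cycle described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name in it (`refine_loop`, `Attempt`, `toy_generate`, `toy_verify`, `toy_execute`) is hypothetical, and the LLM and the Memgraph backend are replaced by small stubs so that the loop runs on its own. The schema check is a deliberately simple regex over node labels, assumed here only to show how feedback from one attempt can steer the next.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One generated query plus the issues and rows it produced (hypothetical type)."""
    query: str
    feedback: List[str] = field(default_factory=list)
    rows: Optional[list] = None


def refine_loop(
    question: str,
    schema: dict,
    generate: Callable[[str, dict, List[str]], str],
    verify: Callable[[str, dict], List[str]],
    execute: Callable[[str], list],
    max_iters: int = 3,
) -> Tuple[Optional[Attempt], List[Attempt]]:
    """Generate a query, check it, run it, and retry with aggregated feedback."""
    feedback: List[str] = []
    history: List[Attempt] = []
    for _ in range(max_iters):
        attempt = Attempt(query=generate(question, schema, feedback))
        history.append(attempt)
        issues = verify(attempt.query, schema)
        if not issues:
            try:
                attempt.rows = execute(attempt.query)
            except Exception as exc:  # runtime errors become feedback
                issues = [f"execution error: {exc}"]
        if not issues and attempt.rows:
            return attempt, history
        if not issues:
            issues = ["query returned no rows"]
        attempt.feedback = issues
        feedback = feedback + issues  # aggregate across iterations
    return None, history


# --- Stubs standing in for the LLM and the graph database (assumptions) ---
SCHEMA = {"labels": {"Person", "Movie"}}


def toy_generate(question: str, schema: dict, feedback: List[str]) -> str:
    # Pretend the model hallucinates a label first, then corrects it.
    label = "Movie" if feedback else "Film"
    return f"MATCH (m:{label}) RETURN m.title"


def toy_verify(query: str, schema: dict) -> List[str]:
    # Flag node labels that do not exist in the schema.
    labels = re.findall(r"\(\s*\w*\s*:\s*(\w+)", query)
    return [f"unknown label: {name}" for name in labels if name not in schema["labels"]]


def toy_execute(query: str) -> list:
    # A real system would send the query to the database here.
    return [{"m.title": "Heat"}]


result, history = refine_loop(
    "List movie titles", SCHEMA, toy_generate, toy_verify, toy_execute
)
```

In this toy run, the first attempt is rejected because the label `Film` is not in the schema, and the second attempt succeeds once that feedback is passed back to the generator. In a real deployment the three callables would wrap an LLM call, a schema or entity lookup, and a database client, respectively.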
