Towards Practical GraphRAG: Efficient Knowledge Graph Construction and Hybrid Retrieval at Scale
Congmin Min, Sahil Bansal, Joyce Pan, Abbas Keshavarzi, Rhea Mathew, Amar Viswanathan Kannan
TL;DR
This work tackles the scalability bottlenecks of GraphRAG in enterprise settings in two ways: it replaces costly LLM-based knowledge graph (KG) construction with a dependency-parsing pipeline that retains 94% of LLM performance at a fraction of the cost, and it introduces a hybrid retrieval strategy that fuses vector similarity with graph traversal via Reciprocal Rank Fusion. It presents two KG construction pipelines (dependency-based and LLM-based) and a multi-embedding, one-to-many retrieval framework that preserves entities, chunks, and relations for richer context. On real-world CCM legacy code migration data, GraphRAG delivers measurable gains over dense vector retrieval, including higher context precision and coverage; the dependency-based approach comes close to GPT-4o performance while offering substantial cost and scalability benefits. The results support practical deployment of GraphRAG in production environments and point to future work on broader benchmarks, more advanced graph traversals, and integration with query-time optimizations. Overall, the paper shows that careful engineering of classical NLP components can rival modern LLM-based methods for scalable, domain-adaptable, knowledge-grounded retrieval-augmented reasoning at enterprise scale.
Abstract
We propose a scalable and cost-efficient framework for deploying Graph-based Retrieval-Augmented Generation (GraphRAG) in enterprise environments. While GraphRAG has shown promise for multi-hop reasoning and structured retrieval, its adoption has been limited due to reliance on expensive large language model (LLM)-based extraction and complex traversal strategies. To address these challenges, we introduce two core innovations: (1) an efficient knowledge graph construction pipeline that leverages dependency parsing to achieve 94% of LLM-based performance (61.87% vs. 65.83%) while significantly reducing costs and improving scalability; and (2) a hybrid retrieval strategy that fuses vector similarity with graph traversal using Reciprocal Rank Fusion (RRF), maintaining separate embeddings for entities, chunks, and relations to enable multi-granular matching. We evaluate our framework on two enterprise datasets focused on legacy code migration and demonstrate improvements of up to 15% and 4.35% over vanilla vector retrieval baselines using LLM-as-Judge evaluation metrics. These results validate the feasibility of deploying GraphRAG in production enterprise environments, demonstrating that careful engineering of classical NLP techniques can match modern LLM-based approaches while enabling practical, cost-effective, and domain-adaptable retrieval-augmented reasoning at scale.
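To make the fusion step concrete, the following is a minimal sketch of Reciprocal Rank Fusion over two ranked lists, one standing in for vector-similarity results and one for graph-traversal results. It uses the standard RRF formulation, in which each item scores the sum of 1 / (k + rank) over the lists that contain it. The function name `reciprocal_rank_fusion`, the chunk identifiers, and the smoothing constant k = 60 are illustrative assumptions; the abstract does not state the constant or the exact inputs the authors fuse.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of item ids with Reciprocal Rank Fusion.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    Returns a list of (item_id, score) pairs, best-first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top item of each list contributes 1 / (k + 1).
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical retrieval results for a single query.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]  # from embedding similarity
graph_hits = ["chunk_b", "chunk_d", "chunk_a"]   # from graph traversal

fused = reciprocal_rank_fusion([vector_hits, graph_hits])
for item_id, score in fused:
    print(f"{item_id}: {score:.5f}")
```

Because RRF depends only on rank positions, it can combine retrievers whose raw scores are on incompatible scales, such as cosine similarities and graph-path scores, without any score normalization. Items that appear in both lists (here `chunk_a` and `chunk_b`) accumulate contributions from each and rise above items found by only one retriever.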
