Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

Qizhi Wang

Abstract

GraphRAG systems improve multi-hop retrieval by modeling structure, but many approaches rely on expensive LLM-based graph construction and GPU-heavy inference. We present SPRIG (Seeded Propagation for Retrieval In Graphs), a CPU-only, linear-time, token-free GraphRAG pipeline that replaces LLM graph building with lightweight NER-driven co-occurrence graphs and uses Personalized PageRank (PPR) for [...] 28% with negligible Recall@10 changes. The results characterize when CPU-friendly graph retrieval helps multi-hop recall and when strong lexical hybrids (RRF) are sufficient, outlining a realistic path to democratizing GraphRAG without token costs or GPU requirements.
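
To make the idea in the abstract concrete, the following is a minimal, dependency-free sketch of seeded propagation over an entity co-occurrence graph. It is an illustration under stated assumptions, not the SPRIG implementation: the function names (`build_cooccurrence_graph`, `personalized_pagerank`, `rank_passages`), the toy passages, the passage-scoring rule (sum of entity scores), and the reading of `alpha` as a damping factor are all hypothetical. Entity lists are assumed to come from some upstream NER step, which is not shown.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(passage_entities):
    """Link entities that appear in the same passage (undirected, weighted)."""
    graph = defaultdict(lambda: defaultdict(float))
    for entities in passage_entities.values():
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def personalized_pagerank(graph, seeds, alpha=0.85, max_iter=50, tol=1e-8):
    """Power iteration for PPR.

    Assumption: alpha is the damping factor, so the walk restarts at a
    seed entity with probability 1 - alpha at every step.
    """
    nodes = list(graph)
    restart = {n: 0.0 for n in nodes}
    live_seeds = sorted(set(seeds) & set(nodes))
    if not live_seeds:
        return restart  # no seed entity is in the graph: nothing to propagate
    for s in live_seeds:
        restart[s] = 1.0 / len(live_seeds)

    scores = dict(restart)
    for _ in range(max_iter):
        nxt = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            out_weight = sum(graph[u].values())
            for v, w in graph[u].items():
                nxt[v] += alpha * scores[u] * w / out_weight
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores


def rank_passages(passage_entities, entity_scores, k=10):
    """Hypothetical scoring rule: a passage scores the sum of its entity scores."""
    scored = {
        pid: sum(entity_scores.get(e, 0.0) for e in set(entities))
        for pid, entities in passage_entities.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


# Toy corpus: passage id -> entities found by an (assumed) NER step.
passages = {
    "d1": ["Marie Curie", "Warsaw"],
    "d2": ["Warsaw", "Poland"],
    "d3": ["Tokyo", "Japan"],
}
graph = build_cooccurrence_graph(passages)
scores = personalized_pagerank(graph, seeds=["Marie Curie"])
ranking = rank_passages(passages, scores)
print(ranking)
```

In this toy corpus the query entity "Marie Curie" never co-occurs with "Poland", yet passage `d2` still receives score mass through the shared entity "Warsaw", while the disconnected passage `d3` receives none. That two-step reach is the multi-hop behavior that graph propagation is meant to supply on top of lexical matching.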

Paper Structure (29 sections, 3 equations, 6 figures, 26 tables)

Figures (6)

  • Figure 1: Efficiency curves for index and query time.
  • Figure A.1: Seed size ablation (Recall@10).
  • Figure A.2: Ablation over PPR $\alpha$ and max_iter (Recall@10).
  • Figure A.3: Seed weighting ablation (best Recall@10).
  • Figure A.4: Hub penalty ablation (best Recall@10).
  • ...and 1 more figure