SCOUT-RAG: Scalable and Cost-Efficient Unifying Traversal for Agentic Graph-RAG over Distributed Domains

Longkun Li, Yuanben Zou, Jinghan Wu, Yuqing Wen, Jing Li, Hangwei Qian, Ivor Tsang

TL;DR

SCOUT-RAG addresses the challenge of scalable, cost-efficient reasoning over distributed, access-controlled knowledge by formulating retrieval as a sequential decision problem across multiple domains. It introduces a three-stage, four-agent architecture that estimates domain relevance, seeds domain-scoped retrieval, and iteratively refines answers under explicit budget constraints, without centralized graph access. The approach combines training-free relevance estimation, adaptive local/global traversal, and quality-guided expansion to balance retrieval cost and grounding fidelity. Experimental results on 89 multi-domain queries across 45 countries show that SCOUT-RAG performs close to centralized baselines such as DRIFT while substantially reducing tokens and latency, demonstrating practical scalability for cross-domain knowledge synthesis.

Abstract

Graph-RAG improves LLM reasoning using structured knowledge, yet conventional designs rely on a centralized knowledge graph. In distributed and access-restricted settings (e.g., hospitals or multinational organizations), retrieval must select relevant domains and an appropriate traversal depth without global graph visibility or exhaustive querying. To address this challenge, we introduce SCOUT-RAG (Scalable and COst-efficient Unifying Traversal), a distributed agentic Graph-RAG framework that performs progressive cross-domain retrieval guided by incremental utility goals. SCOUT-RAG employs four cooperative agents that: (i) estimate domain relevance, (ii) decide when to expand retrieval to additional domains, (iii) adapt traversal depth to avoid unnecessary graph exploration, and (iv) synthesize high-quality answers. The framework is designed to minimize retrieval regret, defined as missing useful domain information, while controlling latency and API cost. Across multi-domain knowledge settings, SCOUT-RAG achieves performance comparable to centralized baselines, including DRIFT and exhaustive domain traversal, while substantially reducing cross-domain calls, total tokens processed, and latency.
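The idea of progressive, budget-aware expansion across domains can be illustrated with a minimal sketch. This is not the authors' implementation: the `Domain` class, the keyword-overlap `relevance` score, the diminishing-returns gain model, and the `min_gain` threshold are all hypothetical stand-ins chosen only to show the control flow (rank domains by estimated relevance, then expand one domain at a time until the budget or the expected gain runs out).

```python
from dataclasses import dataclass


@dataclass
class Domain:
    """Hypothetical domain descriptor; not taken from the paper."""
    name: str
    description: str  # short public summary of the domain (assumed available)
    cost: int         # assumed token cost of querying this domain


def relevance(query: str, domain: Domain) -> float:
    """Toy training-free score: fraction of query words found in the description."""
    q = set(query.lower().split())
    d = set(domain.description.lower().split())
    return len(q & d) / len(q) if q else 0.0


def progressive_retrieve(query, domains, budget, min_gain=0.05):
    """Visit domains in descending relevance until budget or marginal gain runs out."""
    ranked = sorted(domains, key=lambda d: relevance(query, d), reverse=True)
    visited, spent, quality = [], 0, 0.0
    for dom in ranked:
        if spent + dom.cost > budget:
            break  # the next domain would exceed the budget
        # Assumed gain model: each new domain closes part of the remaining gap.
        gain = relevance(query, dom) * (1.0 - quality)
        if visited and gain < min_gain:
            break  # expected improvement too small to justify another call
        visited.append(dom.name)
        spent += dom.cost
        quality += gain
    return visited, spent, quality


domains = [
    Domain("japan", "japan dietary habits rice fish", 100),
    Domain("france", "france dietary habits bread cheese", 100),
    Domain("brazil", "brazil football history", 100),
]
visited, spent, quality = progressive_retrieve(
    "dietary habits japan france", domains, budget=1000
)
print(visited, spent, quality)
```

With a generous budget the loop still stops before the irrelevant third domain, because its estimated gain falls below `min_gain`; with a tight budget it stops earlier on cost alone. In the framework described above, these decisions are made by cooperating agents rather than by a fixed scoring rule.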

Paper Structure

This paper contains 26 sections, 8 equations, 4 figures, 1 table, and 1 algorithm.

Figures (4)

  • Figure 1: Overview of the proposed SCOUT-RAG framework. The framework operates in three stages: (i) domain relevance estimation, (ii) domain-aware retrieval (local vs. global), and (iii) iterative answer refinement via synthesis and generation agents. These agentic components collectively support adaptive, cross-domain reasoning without centralized graph access.
  • Figure 2: Illustration of Stage III, where the system evaluates and improves answers through selective retrieval and synthesis. Example query: "What are the main differences in dietary habits among countries such as Japan, France, and the United States?".
  • Figure 3: Time-performance monitoring and comparison.
  • Figure 4: Case study on Italy's "Made in Italy" certification.