SoK: Agentic Retrieval-Augmented Generation (RAG): Taxonomy, Architectures, Evaluation, and Research Directions

Saroj Mishra; Suman Niroula; Umesh Yadav; Dilip Thakur; Srijan Gyawali; Shiva Gaire

TL;DR

This paper formalizes agentic retrieval-generation loops as finite-horizon partially observable Markov decision processes (POMDPs), explicitly modeling their control policies and state transitions. On that basis, it develops a taxonomy and modular architectural decomposition that categorizes systems by planning mechanism, retrieval orchestration, memory paradigm, and tool-invocation behavior.

Abstract

Retrieval-Augmented Generation (RAG) systems are increasingly evolving into agentic architectures where large language models autonomously coordinate multi-step reasoning, dynamic memory management, and iterative retrieval strategies. Despite rapid industrial adoption, current research lacks a systematic understanding of Agentic RAG as a sequential decision-making system, leading to highly fragmented architectures, inconsistent evaluation methodologies, and unresolved reliability risks. This Systematization of Knowledge (SoK) paper provides the first unified framework for understanding these autonomous systems. We formalize agentic retrieval-generation loops as finite-horizon partially observable Markov decision processes, explicitly modeling their control policies and state transitions. Building upon this formalization, we develop a comprehensive taxonomy and modular architectural decomposition that categorizes systems by their planning mechanisms, retrieval orchestration, memory paradigms, and tool-invocation behaviors. We further analyze the critical limitations of traditional static evaluation practices and identify severe systemic risks inherent to autonomous loops, including compounding hallucination propagation, memory poisoning, retrieval misalignment, and cascading tool-execution vulnerabilities. Finally, we outline key doctoral-scale research directions spanning stable adaptive retrieval, cost-aware orchestration, formal trajectory evaluation, and oversight mechanisms, providing a definitive roadmap for building reliable, controllable, and scalable agentic retrieval systems.
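
For readers unfamiliar with the formalism named above, the textbook definition of a finite-horizon POMDP is reproduced here. This is the generic definition, not the paper's own notation, and the symbols $\mathcal{S}$, $\mathcal{A}$, $\mathcal{O}$, $T$, $Z$, $R$, $H$, and $h_t$ are assumptions chosen for illustration.

$$
\mathcal{P} = (\mathcal{S}, \mathcal{A}, \mathcal{O}, T, Z, R, H), \qquad T(s' \mid s, a), \quad Z(o \mid s', a), \quad R(s, a)
$$

$$
\pi^\star = \arg\max_{\pi} \; \mathbb{E}\left[\sum_{t=0}^{H-1} R(s_t, a_t)\right], \qquad a_t \sim \pi(\cdot \mid h_t), \quad h_t = (o_0, a_0, \dots, a_{t-1}, o_t)
$$

Under one plausible reading of the abstract, actions $a_t$ would include retrieving, generating, invoking a tool, or terminating; observations $o_t$ would be retrieved passages and tool outputs; and the horizon $H$ would bound the number of loop steps. The paper's actual state, observation, and reward definitions may differ.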

Paper Structure

This paper contains 74 sections, 4 equations, 8 figures, 11 tables.

Introduction
Background and Foundations
Large Language Models
Retrieval-Augmented Generation
Tool-Augmented and Agentic LLMs
Multi-Hop Reasoning and Planning
Memory-Augmented Systems
From Static RAG to Agentic RAG
Limitations of Standard RAG Pipelines
Need for Iterative Retrieval
Emergence of Planning-Driven Retrieval
Formal Definition of Agentic RAG
System-Level Formalization
Necessary Properties
Distinguishing Active RAG vs Agentic RAG
...and 59 more sections

Figures (8)

Figure 1: High-level progression from single-pass retrieval-augmented generation to iterative retrieval and Agentic RAG. This demonstrates the architectural shift from static, one-shot context utilization to explicit multi-step control over retrieval, reasoning, and termination, conceptually anchoring the systematization presented in this paper.
Figure 2: The architectural evolution from static one-shot RAG pipelines to the Agentic RAG POMDP formulation. The Agentic framework replaces linear generation with a cyclic control policy ($\pi_\theta$) managing a persistent memory state ($\mathcal{M}_t$).
Figure 3: Taxonomy of Agentic RAG systems across architecture, retrieval strategy, reasoning paradigm, and memory/context management. This structural mapping demonstrates how orthogonal control-flow decisions combine to form distinct, reproducible agentic archetypes.
Figure 4: Core architectural components and control-flow relationships within a generalized Agentic RAG system. This demonstrates how the Reasoning Engine coordinates bidirectionally with Memory Systems and delegates execution to the Tool Orchestration Layer to maintain verifiable state control.
Figure 5: The closed-loop Perception-Planning-Action-Reflection (PPAR) cycle with Human-in-the-Loop (HITL) escalation. This demonstrates the structural necessity of verification loops: outputs failing constraint checks are returned as structured feedback, and unresolvable loops are escalated to prevent autonomous hallucination.
...and 3 more figures
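
The closed-loop behavior described in the Figure 5 caption (constraint checks, structured feedback, escalation of unresolved loops) can be sketched as a bounded control loop. This is a minimal illustration under stated assumptions, not the paper's implementation: `run_agentic_loop`, `LoopResult`, and the `retrieve`, `generate`, and `check` callables are hypothetical names, and the fixed `horizon` stands in for the finite-horizon bound.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopResult:
    answer: Optional[str]   # accepted answer, or None if escalated
    escalated: bool         # True when the loop hands off to a human reviewer
    steps: int              # number of loop iterations executed
    memory: List[str] = field(default_factory=list)  # evidence and feedback so far


def run_agentic_loop(
    query: str,
    retrieve: Callable[[str], List[str]],            # hypothetical retriever
    generate: Callable[[str, List[str]], str],       # hypothetical generator
    check: Callable[[str, List[str]], Optional[str]],  # returns None if the draft passes
    horizon: int = 3,
) -> LoopResult:
    memory: List[str] = []
    current_query = query
    for step in range(1, horizon + 1):
        # Perception/action: gather evidence for the current (possibly refined) query.
        memory.extend(retrieve(current_query))
        # Planning/generation: draft an answer from the accumulated memory.
        draft = generate(query, memory)
        # Reflection: a constraint check that yields feedback text on failure.
        feedback = check(draft, memory)
        if feedback is None:
            return LoopResult(draft, False, step, memory)
        # Failed drafts are fed back into the loop as structured feedback.
        memory.append(f"feedback: {feedback}")
        current_query = f"{query} {feedback}"
    # Horizon exhausted without a passing draft: escalate instead of answering.
    return LoopResult(None, True, horizon, memory)
```

In this sketch the caller supplies the three callables, for example a vector-store lookup for `retrieve` and a groundedness test for `check`. Returning `escalated=True` with no answer when the horizon runs out mirrors the caption's point that unresolvable loops are escalated rather than allowed to produce an unchecked output.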
