Keyword search is all you need: Achieving RAG-Level Performance without vector databases using agentic tool use

Shreyas Subramanian; Adewale Akinfaderin; Yanyan Zhang; Ishan Singh; Mani Khanuja; Sandeep Singh; Maira Ladeira Tanke

TL;DR

This study systematically compares RAG-based systems with tool-augmented LLM agents, evaluating their retrieval mechanisms and response quality when the agent has access only to basic keyword search tools.

Abstract

While Retrieval-Augmented Generation (RAG) has proven effective for generating accurate, context-grounded responses from existing knowledge bases, it presents several challenges, including dependence on retrieval quality, integration complexity, and cost. Recent advances in agentic RAG and tool-augmented LLM architectures have introduced alternative approaches to information retrieval and processing. We question how much additional value vector databases and semantic search bring to RAG over simple, agentic keyword search in documents for question answering. In this study, we conducted a systematic comparison between RAG-based systems and tool-augmented LLM agents, specifically evaluating their retrieval mechanisms and response quality when the agent has access only to basic keyword search tools. Our empirical analysis demonstrates that tool-based keyword search within an agentic framework can attain over $90\%$ of the performance of traditional RAG systems, as measured by the same metrics, without using a standing vector database. Our approach is simple to implement, cost-effective, and particularly useful in scenarios requiring frequent updates to knowledge bases.
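To make the idea of a "basic keyword search tool" concrete, the following is a minimal sketch of the kind of function an agent could call in place of a vector-database lookup. It is an illustration only, not the authors' implementation: the function name `keyword_search`, its parameters, and the in-memory `docs` dictionary are all assumptions introduced here.

```python
import re
from typing import Dict, List, Tuple


def keyword_search(
    docs: Dict[str, str], keywords: List[str], context: int = 1
) -> List[Tuple[str, int, str]]:
    """Hypothetical tool: find lines containing any keyword (case-insensitive).

    Returns (document name, 1-based line number, snippet) tuples, where the
    snippet includes `context` lines before and after the matching line.
    """
    if not keywords:
        return []  # an empty pattern would otherwise match every line
    # Escape keywords so they are matched literally, not as regex syntax.
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    hits = []
    for name, text in docs.items():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if pattern.search(line):
                lo = max(0, i - context)
                hi = min(len(lines), i + context + 1)
                hits.append((name, i + 1, "\n".join(lines[lo:hi])))
    return hits


# Toy corpus (made up for illustration).
docs = {
    "a.txt": "Solana uses proof of history.\nUnrelated line.\nValidators vote on blocks.",
    "b.txt": "Nothing here.",
}
hits = keyword_search(docs, ["proof of history", "validators"], context=0)
```

In an agentic setting, the model would choose the keywords itself, inspect the returned snippets, and issue follow-up searches as needed. No index has to be built or refreshed when documents change, which is the property the abstract highlights for frequently updated knowledge bases.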

Paper Structure (13 sections, 2 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Methodology
Datasets
Experiment 1: Baseline RAG Implementation
Experiment 2: Agentic Search Framework
Evaluation Methodology
Results
Agentic keyword search vs. Claude computer use
Conclusion
Agent terminal tool instructions
Example detailed agent run
Computer Use Agent Interactions

Figures (2)

Figure 1: Comparison between RAG (red) and agent-based (blue) pipelines for document QnA
Figure 2: Coverage comparison of Tool-Augmented Agent vs RAG metrics across the BlockchainSolana and LLM Survey Paper datasets
