ERATTA: Extreme RAG for Table To Answers with Large Language Models

Sohini Roychowdhury; Marko Krema; Anvar Mahammad; Brian Moore; Arijit Mukherjee; Punit Prakashchandra

TL;DR

This work proposes a unique LLM-based system where multiple LLMs can be invoked to enable data authentication, user-query routing, data-retrieval and custom prompting for question-answering capabilities from Enterprise-data tables.

Abstract

Large language models (LLMs) with retrieval-augmented generation (RAG) have recently been the optimal choice for scalable generative AI solutions. Although RAG implemented with AI agents (agentic-RAG) has recently been popularized, it suffers from unstable cost and unreliable performance for Enterprise-level data practices. Most existing use-cases that incorporate RAG with LLMs have been either generic or extremely domain-specific, which calls into question the scalability and generalizability of RAG-LLM approaches. In this work, we propose a unique LLM-based system where multiple LLMs can be invoked to enable data authentication, user-query routing, data retrieval and custom prompting for question-answering capabilities from Enterprise-data tables. The source tables here are highly fluctuating and large in size, and the proposed framework enables structured responses in under 10 seconds per query. Additionally, we propose a five-metric scoring module that detects and reports hallucinations in the LLM responses. Our proposed system and scoring metrics achieve >90% confidence scores across hundreds of user queries in the sustainability, financial health and social media domains. Extensions to the proposed extreme RAG architectures can enable heterogeneous source querying using LLMs.

Paper Structure (15 sections, 1 equation, 5 figures, 4 tables)

Introduction
Related Work
Data and Methods
Extreme-RAG System Components
Authentication RAG
Prompt 1, Routing a User-query
Prompt 2, Data Retrieval
Prompt 3, Answer Retrieval
Quality Assurance
Experiments and Results
Qualitative Scalability Analysis
Quantitative Response Analysis
Extreme-RAG Extensions
Comparison with Agentic-RAG
Conclusion

Figures (5)

Figure 1: The proposed extreme-RAG system architecture that invokes 3-4 LLM prompts per user query to authenticate, route, extract smaller tabular data and fetch the required response from fast-varying and large tabular sustainability data sources.
Figure 2: Examples of class definitions to extract tabular metadata and enable scaling prompt 2 across source tables and user query types.
Figure 3: Example of the steps in Prompts 1, 2 and 3 to retrieve an appropriate response from the sustainability data tables.
Figure 4: Averaged response quality scores on the sustainability dataset ($s_1$ to $s_5$).
Figure 5: Example of an extension of the proposed Extreme RAG system to solve for predictive and prescriptive trend questions using back-end optimization algorithms that can be invoked by LLM prompts (prompt 2).
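
The prompt chain described in the Figure 1 caption (authenticate, route, extract a smaller table, fetch the response) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `answer_query`, the `permissions` lookup that stands in for the Authentication RAG step, the prompt wording, and the `column=value` filter format are all assumptions made for the example, and `llm` is any caller-supplied function from prompt text to reply text.

```python
from typing import Callable, Dict, List, Set

# Hypothetical types: an "LLM" is any function mapping a prompt to a reply,
# and a table is a list of row dictionaries.
LLM = Callable[[str], str]
Table = List[Dict[str, object]]


def answer_query(user: str, query: str, tables: Dict[str, Table],
                 permissions: Dict[str, Set[str]], llm: LLM) -> str:
    """Chain three prompts over tabular data (illustrative sketch only)."""
    # Authentication (assumed here to be a plain permission lookup):
    # keep only the tables this user is allowed to read.
    allowed = {name: rows for name, rows in tables.items()
               if name in permissions.get(user, set())}
    if not allowed:
        return "ACCESS_DENIED"

    # Prompt 1 (routing): ask which table the query refers to.
    table_name = llm(
        f"Tables: {sorted(allowed)}\nQuery: {query}\nReply with one table name."
    ).strip()
    if table_name not in allowed:
        return "NO_ROUTE"

    # Prompt 2 (data retrieval): ask for a filter, then apply it in code so
    # that only a small slice of the table reaches the final prompt.
    rows = allowed[table_name]
    columns = sorted(rows[0]) if rows else []
    reply = llm(f"Columns: {columns}\nQuery: {query}\nReply as column=value.")
    column, _, value = reply.strip().partition("=")
    subset = [row for row in rows if str(row.get(column)) == value]

    # Prompt 3 (answer retrieval): answer from the reduced table only.
    return llm(f"Rows: {subset}\nQuery: {query}\nAnswer concisely.")
```

Filtering in code between Prompt 2 and Prompt 3 is one plausible way to keep the final prompt small when the source tables are large. The paper's Quality Assurance scoring step is not modeled in this sketch.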
