RAG based Question-Answering for Contextual Response Prediction System
Sriram Veturi, Saurabh Vaichal, Reshma Lal Jagadheesh, Nafis Irtiza Tripto, Nian Yan
TL;DR
This work tackles the challenge of grounding large language models (LLMs) in industry-specific knowledge to prevent hallucinations in customer-service contexts. It proposes an end-to-end Retrieval Augmented Generation (RAG) framework for a Response Prediction System (RPS) deployed in a major retailer's contact centers, combining knowledge-base retrieval and previous chat history with LLM generation. Through automated and human evaluations, it shows that RAG-based LLMs achieve higher accuracy, alignment, and semantic coherence than a BERT-based baseline, and it highlights the added latency of ReAct and advanced prompting as a concern for real-time deployment. The findings support the practical viability of RAG-based LLMs for knowledge-grounded agent assistance and suggest future work on broader LLM comparisons, query rewriting, and multi-source RAG integration.
Abstract
Large Language Models (LLMs) have shown versatility in various Natural Language Processing (NLP) tasks, including their potential as effective question-answering systems. However, to provide precise and relevant information in response to specific customer queries in industry settings, LLMs require access to a comprehensive knowledge base to avoid hallucinations. Retrieval Augmented Generation (RAG) is a promising technique for addressing this challenge. Yet developing an accurate RAG-based question-answering framework for real-world applications entails several challenges: 1) limited data availability, 2) the difficulty of evaluating the quality of generated content, and 3) the high cost of human evaluation. In this paper, we introduce an end-to-end framework that employs LLMs with RAG capabilities for industry use cases. Given a customer query, the proposed system retrieves relevant knowledge documents and leverages them, along with previous chat history, to generate response suggestions for customer service agents in the contact centers of a major retail company. Through comprehensive automated and human evaluations, we show that this solution outperforms the current BERT-based algorithms in accuracy and relevance. Our findings suggest that RAG-based LLMs can effectively support human customer service representatives by lightening their workload.
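The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. It is not the system evaluated in this paper: the bag-of-words cosine retriever stands in for whatever retriever the deployed system uses, the knowledge-base entries are made-up toy data, and the names `retrieve`, `build_prompt`, and `KB` are hypothetical. The sketch stops at prompt assembly and does not call an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, history, docs, k=2):
    """Combine retrieved knowledge, chat history, and the query into one prompt."""
    lines = ["Answer using only the knowledge below."]
    lines += [f"[KB {i + 1}] {d}" for i, d in enumerate(retrieve(query, docs, k))]
    lines += [f"{speaker}: {utterance}" for speaker, utterance in history]
    lines.append(f"Customer: {query}")
    lines.append("Agent:")  # the LLM would complete the response from here
    return "\n".join(lines)


# Hypothetical toy knowledge base and chat history, for illustration only.
KB = [
    "Refunds are issued to the original payment method within 5 business days.",
    "Store hours are 9am to 9pm on weekdays.",
    "Orders can be cancelled before they ship.",
]
HISTORY = [
    ("Customer", "Hi, I returned an item last week."),
    ("Agent", "Thanks, I can see the return."),
]
QUERY = "How long do refunds take to reach my payment method?"

prompt = build_prompt(QUERY, HISTORY, KB, k=2)
print(prompt)
```

In a full pipeline the resulting prompt would be sent to an LLM, and the completion would be shown to the customer service agent as a suggested response.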
