Context Awareness Gate For Retrieval Augmented Generation
Mohammad Hassan Heydari, Arshia Hemmat, Erfan Naman, Afsaneh Fatemi
TL;DR
This work addresses the challenge that retrieval-augmented generation (RAG) often retrieves irrelevant information, degrading QA performance. It introduces the Context Awareness Gate (CAG), which dynamically adjusts LLM prompts and retrieval decisions, and Vector Candidates (VC), a statistical, LLM-free mechanism that decides whether retrieval is needed. The Context Retrieval Supervision Benchmark (CRSB) provides a broad 17-field dataset to study context-query distributions and support evaluation. Empirically, CAG outperforms baselines with higher context and answer relevancy while offering significant efficiency advantages over LLM-supervised approaches, enabling scalable open-domain QA with improved reliability.
Abstract
Retrieval Augmented Generation (RAG) has emerged as a widely adopted approach to mitigate the limitations of large language models (LLMs) in answering domain-specific questions. Previous research has predominantly focused on improving the accuracy and quality of retrieved data chunks to enhance the overall performance of the generation pipeline. However, the critical issue of retrieving irrelevant information, which can impair the model's ability to use its internal knowledge effectively, has received minimal attention. In this work, we investigate the impact of retrieving irrelevant information in open-domain question answering, highlighting its significant detrimental effect on the quality of LLM outputs. To address this challenge, we propose the Context Awareness Gate (CAG) architecture, a novel mechanism that dynamically adjusts the LLM's input prompt based on whether the user query requires external context retrieval. Additionally, we introduce the Vector Candidates method, a core mathematical component of CAG that is statistical, LLM-independent, and highly scalable. We further examine the distributions of relationships between contexts and questions and present a statistical analysis of these distributions. This analysis can be leveraged to improve the context retrieval process in RAG systems.
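To make the gating idea concrete, the following is a minimal sketch of an LLM-free retrieval gate that switches the prompt depending on whether the query appears to need external context. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the statistic used by Vector Candidates, so the sketch assumes a cosine-similarity comparison between the query embedding and a set of precomputed candidate vectors against a fixed threshold. The function names (`needs_retrieval`, `build_prompt`, `gate`), the prompt wording, and the default threshold of 0.5 are all hypothetical.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def needs_retrieval(query_vec: Vector,
                    candidate_vecs: Sequence[Vector],
                    threshold: float) -> bool:
    """Assumed decision rule: retrieve only if the query is close enough
    to at least one candidate vector. No LLM call is involved."""
    best = max((cosine(query_vec, c) for c in candidate_vecs), default=0.0)
    return best >= threshold


def build_prompt(query: str, context: Optional[str]) -> str:
    """Hypothetical prompt templates for the two branches of the gate."""
    if context is None:
        return f"Answer from your own knowledge.\n\nQuestion: {query}"
    return (f"Answer using the context below.\n\n"
            f"Context: {context}\n\nQuestion: {query}")


def gate(query: str,
         query_vec: Vector,
         candidate_vecs: Sequence[Vector],
         retrieve: Callable[[str], str],
         threshold: float = 0.5) -> str:
    """Run retrieval only when the gate opens; otherwise skip it entirely."""
    if needs_retrieval(query_vec, candidate_vecs, threshold):
        return build_prompt(query, retrieve(query))
    return build_prompt(query, None)
```

Because the decision uses only vector arithmetic, its cost grows with the number of candidate vectors rather than with LLM calls. In practice the threshold would presumably be derived from the observed context-query similarity distributions rather than fixed by hand.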
