Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs

Xiaqiang Tang, Jian Li, Nan Du, Sihong Xie

TL;DR

This work introduces a Multi-Armed Bandit (MAB) enhanced Retrieval-Augmented Generation (RAG) framework for knowledge-graph-based question answering. It targets non-stationary real-world environments by dynamically selecting among multiple retrieval methods using real-time feedback. The framework combines a DistilBERT-based query encoder, an epsilon-greedy arm selector, offline-to-online learning, and a Generalized Gini Index (GGI) that balances multi-objective rewards such as accuracy and retrieval latency. Across two KGQA datasets, the proposed GGIMAB approach outperforms baselines in non-stationary settings and achieves state-of-the-art performance in stationary settings, adapting to both backend upgrades and domain shifts. The results highlight the practical value of continuously adapting retrieval strategies in RAG systems so that responses stay informative and timely in dynamic environments.

Abstract

Despite the superior performance of large language models (LLMs) on many NLP tasks, they still face significant limitations in memorizing extensive world knowledge. Recent studies have demonstrated that leveraging the Retrieval-Augmented Generation (RAG) framework, combined with Knowledge Graphs (KGs) that encapsulate extensive factual data in a structured format, robustly enhances the reasoning capabilities of LLMs. However, deploying such systems in real-world scenarios presents two challenges: the continuous evolution of non-stationary environments may lead to performance degradation, and user satisfaction requires a careful balance of performance and responsiveness. To address these challenges, we introduce a Multi-objective Multi-Armed Bandit enhanced RAG framework, supported by multiple retrieval methods with diverse capabilities under rich and evolving retrieval contexts in practice. Within this framework, each retrieval method is treated as a distinct "arm". The system utilizes real-time user feedback to adapt to dynamic environments by selecting the appropriate retrieval method based on input queries and the historical multi-objective performance of each arm. Extensive experiments conducted on two benchmark KGQA datasets demonstrate that our method significantly outperforms baseline methods in non-stationary settings while achieving state-of-the-art performance in stationary environments. Code and data are available at https://github.com/FUTUREEEEEE/Dynamic-RAG.git
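
The arm-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it drops the query encoder entirely (so it is non-contextual), and the class name `EpsilonGreedyGGI`, the constant `step_size` update, the weight values, and the reward layout (accuracy and a latency score, both assumed to be normalized to [0, 1] with higher being better) are all assumptions made for illustration. It only shows how epsilon-greedy selection might be combined with a Generalized Gini Index, taken here as a weighted sum of the ascending-sorted objective values with non-increasing weights, so that the worst objective counts most.

```python
import random
from typing import List, Optional, Sequence


def ggi(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Generalized Gini Index for rewards (assumed form).

    Sorts the objective values in ascending order and pairs them with
    non-increasing weights, so the weakest objective gets the largest weight.
    """
    ordered = sorted(rewards)
    return sum(w * r for w, r in zip(weights, ordered))


class EpsilonGreedyGGI:
    """Hypothetical, non-contextual arm selector over retrieval methods."""

    def __init__(self, n_arms: int, n_objectives: int,
                 weights: Sequence[float], epsilon: float = 0.1,
                 step_size: float = 0.2, seed: Optional[int] = None) -> None:
        self.weights = list(weights)
        self.epsilon = epsilon
        # Constant step size = exponential recency weighting, an assumption
        # chosen here so that old feedback fades in non-stationary settings.
        self.step_size = step_size
        self.means: List[List[float]] = [
            [0.0] * n_objectives for _ in range(n_arms)
        ]
        self.rng = random.Random(seed)

    def select(self) -> int:
        """Explore with probability epsilon, else pick the best GGI score."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.means))
        scores = [ggi(m, self.weights) for m in self.means]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm: int, reward: Sequence[float]) -> None:
        """Move the arm's per-objective estimates toward the new feedback."""
        for j, r in enumerate(reward):
            self.means[arm][j] += self.step_size * (r - self.means[arm][j])


if __name__ == "__main__":
    # Two hypothetical arms; each reward is (accuracy, latency score).
    bandit = EpsilonGreedyGGI(n_arms=2, n_objectives=2,
                              weights=[0.75, 0.25], epsilon=0.1, seed=0)
    bandit.update(0, [0.9, 0.2])   # accurate but slow
    bandit.update(1, [0.6, 0.6])   # balanced
    print("selected arm:", bandit.select())
```

Under these assumptions the GGI favors balanced arms: with weights (0.75, 0.25), an arm with estimates (0.5, 0.5) scores 0.5, while an arm with estimates (1.0, 0.0) scores only 0.25, even though both have the same plain average.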

Paper Structure

This paper contains 21 sections, 4 equations, 7 figures, 5 tables, and 1 algorithm.

Figures (7)

  • Figure 1: An online KG-based RAG system facing challenges from non-stationary environments and the need to balance multiple objectives for optimal user experience.
  • Figure 2: Proposed MAB-enhanced RAG framework. The input query undergoes feature extraction (e.g., multi-entity query), followed by the MAB algorithm, which selects the optimal retrieval method by predicting the most rewarding option (e.g., Query Language method). The selected method retrieves information from a Knowledge Graph (KG), and an LLM generates the final response. Feedback is collected as a reward, updating the MAB model parameters online, and enabling continuous adaptation to non-stationary environments.
  • Figure 3: Comparison of retrieval methods for the query "What are some books that Mark Twain wrote?" Dense Retrieval is fast but has low recall, while KG-Agent-Retriever provides broad coverage but is slow. Our system selects the SPARQL-Retriever (ChatKBQA), which generates an accurate search language command for precise and efficient results.
  • Figure 4: Confusion matrices comparing retrieval methods (Decaf, ChatKBQA, BGE, RoG) on WebQSP and CWQ datasets, indicating distinctiveness among methods.
  • Figure 5: MAB-enhanced RAG systems with LLM variants under stationary environments.
  • ...and 2 more figures