AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation

Jia Fu; Xiaoting Qin; Fangkai Yang; Lu Wang; Jue Zhang; Qingwei Lin; Yubo Chen; Dongmei Zhang; Saravan Rajmohan; Qi Zhang

TL;DR

AutoRAG-HP formulates hyper-parameter tuning for Retrieval-Augmented Generation (RAG) as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces.

Abstract

Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as the number of top-k retrieved documents, the prompt compression ratio, and the embedding method, using the ALCE-ASQA and Natural Questions datasets. Our evaluation of jointly optimizing all three hyper-parameters demonstrates that MAB-based online learning methods can achieve Recall@5 $\approx 0.8$ in scenarios with prominent gradients in the search space, using only $\sim20\%$ of the LLM API calls required by the Grid Search approach. Additionally, the proposed Hier-MAB approach outperforms other baselines in more challenging optimization scenarios. The code will be made available at https://aka.ms/autorag.
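To make the bandit formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each arm is one combination of (top-k, compression ratio, embedding method) and uses the standard UCB1 rule as the arm-selection policy; the abstract does not say which bandit policy the paper uses. It is a flat bandit, not the two-level Hier-MAB. The grid values, the embedding names `emb-a`/`emb-b`, and `toy_reward` are hypothetical stand-ins for a real RAG pipeline and its evaluation metric.

```python
import math
import random
from itertools import product


def ucb1_tune(arms, reward_fn, rounds):
    """Flat UCB1 over a discrete set of hyper-parameter configurations.

    Returns the most-played arm, the per-arm pull counts, and the per-arm mean rewards.
    """
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    for t in range(1, rounds + 1):
        if t <= len(arms):
            i = t - 1  # initialisation: play every arm once
        else:
            # UCB1 index: empirical mean plus exploration bonus sqrt(2 ln t / n_j)
            i = max(
                range(len(arms)),
                key=lambda j: means[j] + math.sqrt(2.0 * math.log(t) / counts[j]),
            )
        r = reward_fn(arms[i])
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental mean update
    best = max(range(len(arms)), key=lambda j: counts[j])
    return arms[best], counts, means


def make_toy_reward(seed=0):
    """Hypothetical reward: a made-up smooth function plus small noise.

    In a real system this would run the RAG pipeline on a batch of queries
    and return a metric in [0, 1]; the formula below is an assumption
    chosen only so that the example has a unique best arm.
    """
    rng = random.Random(seed)

    def toy_reward(arm):
        top_k, ratio, emb = arm
        mean = 0.08 * top_k + 0.3 * ratio + (0.2 if emb == "emb-b" else 0.0)
        return mean + rng.uniform(-0.05, 0.05)

    return toy_reward


# Hypothetical search space: 3 x 3 x 2 = 18 arms.
ARMS = list(product((1, 3, 5), (0.2, 0.6, 1.0), ("emb-a", "emb-b")))

if __name__ == "__main__":
    best, counts, _ = ucb1_tune(ARMS, make_toy_reward(), rounds=3000)
    print("most-played configuration:", best)
    print("pulls per arm:", counts)
```

In this flat formulation the number of arms is the product of the per-parameter grid sizes, so the initialisation phase alone costs one evaluation per combination. That growth is the large-search-space difficulty the abstract gives as the motivation for a hierarchical method.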

Paper Structure (18 sections, 2 equations, 14 figures, 3 tables)


Introduction
Related Work
AutoML and LLMs
Hyper-parameter Optimization
Hyper-parameter Tuning in RAG
Methodology
Problem Formulation
Two-level Hierarchical MAB
Evaluation
Experiment Setup
Experiment Result
Ablation Study
Case Study: Upgrade Base LLM from GPT-3.5-Turbo to GPT-4
Discussion
Summary
...and 3 more sections

Figures (14)

Figure 1: A RAG system with tunable hyper-parameters.
Figure 2: An example of two-level hierarchical MAB.
Figure 3: Grid search results for ASQA with GPT-4. Error bars represent the standard deviations of accuracy and reward values across all batches.
Figure 4: Evolution of Recall@3 in the optimization of $(\mathcal{K}, \mathcal{C})$ for the GPT-4 case.
Figure 5: Evolution of Recall@5 in the optimization of $(\mathcal{K}, \mathcal{C}, \mathcal{E})$ for the GPT-4 case.
...and 9 more figures
