DAT: Dynamic Alpha Tuning for Hybrid Retrieval in Retrieval-Augmented Generation
Hsin-Ling Hsu, Jengnan Tzeng
TL;DR
DAT introduces a per-query adaptive framework for hybrid retrieval in Retrieval-Augmented Generation. It uses an LLM to score the top-1 results from sparse and dense retrieval, then computes a dynamic weight $\alpha(q)$ that is used to fuse the two retrievers' scores. This approach replaces static offline tuning with query-aware calibration, improving Precision@1 and MRR@20 across English and Chinese benchmark datasets and across multiple model sizes. The key contributions are a lightweight LLM-based effectiveness scoring rubric, a deterministic rule-based computation of $\alpha(q)$, and empirical evidence of strong gains on hybrid-sensitive queries together with reduced performance variance. In practice, the method delivers more accurate and consistent retrieval while staying efficient, because the LLM evaluates only the top result from each retrieval method.
Abstract
Hybrid retrieval techniques in Retrieval-Augmented Generation (RAG) systems enhance information retrieval by combining dense and sparse (e.g., BM25-based) retrieval methods. However, existing approaches struggle with adaptability, because fixed weighting schemes cannot adjust to different queries. To address this, we propose DAT (Dynamic Alpha Tuning), a novel hybrid retrieval framework that dynamically balances dense retrieval and BM25 for each query. DAT leverages a large language model (LLM) to evaluate the effectiveness of the top-1 results from both retrieval methods, assigning an effectiveness score to each. It then calibrates the weighting factor by normalizing these effectiveness scores, yielding an adaptive, query-aware balance between the two approaches. Empirical results show that DAT consistently and significantly outperforms fixed-weighting hybrid retrieval methods across various evaluation metrics. Even with smaller models, DAT delivers strong performance, highlighting its efficiency and adaptability.
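
The weighting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dynamic_alpha`, `min_max`, `fuse`), the equal-weight fallback when both effectiveness scores are zero, and the min-max normalization of retriever scores before fusion are all assumptions made for illustration. The LLM scoring step is not shown; its two effectiveness scores are simply passed in as numbers.

```python
from typing import Dict, List, Tuple


def dynamic_alpha(dense_score: float, sparse_score: float) -> float:
    """Normalize two effectiveness scores into a dense-side weight in [0, 1]."""
    if dense_score < 0 or sparse_score < 0:
        raise ValueError("effectiveness scores must be non-negative")
    total = dense_score + sparse_score
    if total == 0:
        # Assumption: neither top-1 result looks useful, so weight both equally.
        return 0.5
    return dense_score / total


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw retriever scores to [0, 1] so the two lists are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: identical scores all map to 1.0.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(
    dense: Dict[str, float], sparse: Dict[str, float], alpha: float
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * dense + (1 - alpha) * sparse."""
    d, s = min_max(dense), min_max(sparse)
    fused = {
        doc: alpha * d.get(doc, 0.0) + (1.0 - alpha) * s.get(doc, 0.0)
        for doc in set(d) | set(s)
    }
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for one query: the LLM rated the dense top-1 result 4
# and the BM25 top-1 result 1, so the dense retriever gets most of the weight.
alpha = dynamic_alpha(4, 1)
ranking = fuse({"a": 0.9, "b": 0.5}, {"b": 12.0, "c": 3.0}, alpha)
```

With a fixed weight, `alpha` would be the same for every query; here it is recomputed per query from the two effectiveness scores, which is the query-aware behavior the abstract describes.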
