Think-then-Act: A Dual-Angle Evaluated Retrieval-Augmented Generation

Yige Shen, Hao Jiang, Hua Qu, Jihong Zhao

TL;DR

This work tackles the reliability and cost of retrieval-augmented generation by introducing Think-then-Act, a dual-phase framework that first judges query clarity and completeness and then assesses the model's ability to answer before triggering retrieval. By rewriting queries only when necessary and gating retrieval on model confidence, the approach reduces unnecessary API calls while maintaining or improving accuracy across five diverse tasks spanning English and Chinese domains. Key findings show superior performance over baselines, an optimal confidence threshold around 0.5, and significant efficiency gains, especially in the Chinese Poetry domain, where retrieval was used for only 36.8% of queries. The method offers a practical pathway to more cost-efficient, accurate LLM applications in real-world, multilingual settings.

Abstract

Despite their impressive capabilities, large language models (LLMs) often face challenges such as temporal misalignment and generating hallucinatory content. Enhancing LLMs with retrieval mechanisms to fetch relevant information from external sources offers a promising solution. Inspired by the proverb "Think twice before you act," we propose a dual-angle evaluated retrieval-augmented generation framework, Think-then-Act. Unlike previous approaches that indiscriminately rewrite queries or perform retrieval regardless of necessity, or generate temporary responses before deciding on additional retrieval, which increases model generation costs, our framework employs a two-phase process: (i) assessing the input query for clarity and completeness to determine if rewriting is necessary; and (ii) evaluating the model's capability to answer the query and deciding if additional retrieval is needed. Experimental results on five datasets show that the Think-then-Act framework significantly improves performance. Our framework demonstrates notable improvements in accuracy and efficiency compared to existing baselines and performs well in both English and non-English contexts. Ablation studies validate the optimal model confidence threshold, highlighting the resource optimization benefits of our approach.
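
The two-phase process described above can be sketched as a small control-flow routine. This is only an illustrative sketch, not the authors' implementation: the function name `think_then_act`, the `Result` type, and the five callables (`is_clear`, `rewrite`, `confidence`, `retrieve`, `generate`) are hypothetical stand-ins for LLM and retriever calls, and the rule "retrieve when confidence falls below the threshold" is an assumption about how the confidence threshold is applied.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    query: str       # the query actually answered (possibly rewritten)
    rewritten: bool  # whether phase (i) triggered a rewrite
    retrieved: bool  # whether phase (ii) triggered retrieval
    answer: str


def think_then_act(
    query: str,
    is_clear: Callable[[str], bool],            # hypothetical: clarity/completeness judge
    rewrite: Callable[[str], str],              # hypothetical: query rewriter
    confidence: Callable[[str], float],         # hypothetical: model's self-assessed ability in [0, 1]
    retrieve: Callable[[str], List[str]],       # hypothetical: external retriever
    generate: Callable[[str, List[str]], str],  # hypothetical: answer generator
    threshold: float = 0.5,                     # confidence threshold (about 0.5 per the TL;DR)
) -> Result:
    # Phase (i): rewrite only if the query is judged unclear or incomplete.
    rewritten = not is_clear(query)
    if rewritten:
        query = rewrite(query)

    # Phase (ii): retrieve only if the model is not confident enough
    # (assumed direction: low confidence triggers retrieval).
    retrieved = confidence(query) < threshold
    docs = retrieve(query) if retrieved else []

    return Result(query, rewritten, retrieved, generate(query, docs))


# Toy stand-ins so the sketch runs without any model or search API.
def toy_is_clear(q: str) -> bool:
    return q.endswith("?")

def toy_rewrite(q: str) -> str:
    return q.strip() + "?"

def toy_confidence(q: str) -> float:
    return 0.9 if "capital" in q else 0.2

def toy_retrieve(q: str) -> List[str]:
    return ["doc about " + q]

def toy_generate(q: str, docs: List[str]) -> str:
    return f"answer({q}, docs={len(docs)})"


def run_toy(query: str) -> Result:
    return think_then_act(query, toy_is_clear, toy_rewrite,
                          toy_confidence, toy_retrieve, toy_generate)
```

With these toy stand-ins, `run_toy("What is the capital of France?")` skips both rewriting and retrieval, while `run_toy("who wrote this poem")` is rewritten and then answered with retrieved context. The point of the design is that the retriever and the rewriter are invoked only when their respective checks call for them.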

Paper Structure (24 sections, 2 equations, 5 figures, 4 tables)

This paper contains 24 sections, 2 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Think-then-Act: (i) assessing the input query for clarity and completeness to determine if rewriting is necessary; (ii) evaluating the model's capability to answer the query and deciding if additional retrieval is needed.
  • Figure 2: Overall results using the EM metric. Think-then-Act parameter $\beta'=0.5$.
  • Figure 3: Comparison of Think-then-Act and Retrieval-then-Read ($\beta'=0.5$) on the Chinese Poetry dataset: generation accuracy (blue) and retrieval ratio (red).
  • Figure 4: Exact Match (EM) scores across various datasets with different $\beta'$ values.
  • Figure 5: Comparison of Think-then-Act ($\beta'=0.5$, LM postprocessing) and FLARE (LM preprocessing) on the Chinese Poetry dataset: generation accuracy (blue) and retrieval ratio (red).