A Survey of Query Optimization in Large Language Models

Mingyang Song, Mao Zheng

TL;DR

The paper addresses the problem of improving query understanding and evidence sourcing in Retrieval-Augmented Generation by surveying query optimization techniques. It introduces a structured taxonomy—Expansion, Disambiguation, Decomposition, and Abstraction—and classifies queries by evidence needs to map to appropriate optimization operations. The authors compile a broad range of methods (internal/external expansion, decomposition pipelines, disambiguation strategies, and abstraction-driven reasoning) and discuss challenges such as process reward modeling, benchmarking, efficiency, and post-performance alignment. The work provides a foundational framework to guide future research and practical deployment of QO in real-world, retrieval-augmented LLM applications.

Abstract

Query Optimization (QO) refers to techniques aimed at enhancing the efficiency and quality of Large Language Models (LLMs) in understanding and answering queries, especially complex ones in scenarios like Retrieval-Augmented Generation (RAG). Specifically, RAG mitigates the limitations of LLMs by dynamically retrieving and leveraging up-to-date relevant information, which provides a cost-effective solution to the challenge of LLMs producing plausible but potentially inaccurate responses. Recently, as RAG evolves and incorporates multiple components that influence its performance, QO has emerged as a critical element, playing a pivotal role in determining the effectiveness of RAG's retrieval stage in accurately sourcing the necessary multiple pieces of evidence to answer queries correctly. In this paper, we trace the evolution of QO techniques by summarizing and analyzing significant studies. Through an organized framework and categorization, we aim to consolidate existing QO techniques in RAG, elucidate their technological foundations, and highlight their potential to enhance the versatility and applications of LLMs.

Paper Structure

This paper contains 15 sections and 3 figures.

Figures (3)

  • Figure 1: Four atomic operations in QO.
  • Figure 2: Classification of query optimization techniques in detail.
  • Figure 3: Taxonomy tree of core techniques of query optimization.