MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion

Pengyue Jia; Yiding Liu; Xiangyu Zhao; Xiaopeng Li; Changying Hao; Shuaiqiang Wang; Dawei Yin

TL;DR

This work designs a query-query-document generation method, leveraging LLMs’ zero-shot reasoning ability to produce diverse sub-queries and corresponding documents, and proposes a mutual verification process that synergizes generated and retrieved documents for optimal expansion.

Abstract

Query expansion, pivotal in search engines, enhances the representation of user information needs with additional terms. While existing methods expand queries using retrieved or generated contextual documents, each approach has notable limitations. Retrieval-based methods often fail to accurately capture search intent, particularly with brief or ambiguous queries. Generation-based methods, utilizing large language models (LLMs), generally lack corpus-specific knowledge and entail high fine-tuning costs. To address these gaps, we propose a novel zero-shot query expansion framework utilizing LLMs for mutual verification. Specifically, we first design a query-query-document generation method, leveraging LLMs' zero-shot reasoning ability to produce diverse sub-queries and corresponding documents. Then, a mutual verification process synergizes generated and retrieved documents for optimal expansion. Our proposed method is fully zero-shot, and extensive experiments on three public benchmark datasets are conducted to demonstrate its effectiveness over existing methods. Our code is available online at https://github.com/Applied-Machine-Learning-Lab/MILL to ease reproduction.
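The mutual verification step can be illustrated with a minimal sketch. The abstract does not specify how generated and retrieved documents are scored against each other, so the following assumes one plausible scheme: each document is scored by its summed cosine similarity to all documents on the other side, and the top-k from each side are kept and appended to the query. The bag-of-words vectors stand in for whatever text encoder is actually used, and the names `mutual_verify` and `expand_query` are hypothetical, not taken from the released code.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real text encoder (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(docs, scores, k):
    """Return the k documents with the highest scores (ties keep input order)."""
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order[:k]]


def mutual_verify(generated, retrieved, k=1):
    """Score each side by its summed similarity to the other side (assumed scheme)."""
    g_vecs = [embed(d) for d in generated]
    r_vecs = [embed(d) for d in retrieved]
    g_scores = [sum(cosine(g, r) for r in r_vecs) for g in g_vecs]
    r_scores = [sum(cosine(r, g) for g in g_vecs) for r in r_vecs]
    return top_k(generated, g_scores, k), top_k(retrieved, r_scores, k)


def expand_query(query, generated, retrieved, k=1):
    """Append the mutually verified documents to the original query."""
    sel_g, sel_r = mutual_verify(generated, retrieved, k)
    return " ".join([query, *sel_g, *sel_r])


generated = [
    "covid vaccine side effects fever",   # on-topic LLM generation
    "stock market prices today",          # off-topic LLM generation
]
retrieved = [
    "vaccine side effects include fever and fatigue",
    "covid vaccine trials report fever",
]
selected_generated, selected_retrieved = mutual_verify(generated, retrieved, k=1)
expanded = expand_query("covid vaccine side effects", generated, retrieved, k=1)
```

In this toy run the off-topic generated document shares no terms with any retrieved document, so it is filtered out, which is the intuition behind letting the two sources check each other.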

Paper Structure (25 sections, 6 equations, 7 figures, 19 tables)

Introduction
Problem Definition
Methodology
Overview
Query-Query-Document Generation
Mutual Verification
Query Expansion for Retrieval
Experiments
Datasets and Metrics
Baselines
Implementation Details
Main Results
Ablation Study
Varying the Number of Documents
Case Study
...and 10 more sections

Figures (7)

Figure 1: Overview of MILL.
Figure 2: Query-query-document prompt compared to Query2Term, CoT, and Query2Doc. Query-query-document instructs the LLM to expand the original query from multiple perspectives by inferring the sub-queries and generating corresponding contextual documents.
Figure 3: Varying the number of candidate and selected documents on TREC-COVID.
Figure 4: Hyperparameter analysis on the number of document selections on TREC-COVID. The x-axis denotes the number of documents selected, and the y-axis represents the metrics values (NDCG@1000, AP@1000, Recall@1000, and MRR@1000).
Figure 5: Hyperparameter analysis on the number of document selections on TREC-DL-2020. The x-axis denotes the number of documents selected, and the y-axis represents the metrics values (NDCG@1000, AP@1000, Recall@1000, and MRR@1000).
...and 2 more figures
