IterQR: An Iterative Framework for LLM-based Query Rewrite in e-Commercial Search System

Shangyu Chen, Xinyu Jia, Yingfei Zhang, Shuai Zhang, Xiang Li, Wei Lin

TL;DR

IterQR introduces an iterative, LLM-based framework for query rewriting in e-commerce search that tightly integrates domain knowledge, user interactions, and continual learning. Each iteration comprises rewrite generation with Chain-of-Thought reasoning and Retrieval-Augmented Generation, online signal collection to label positive rewrites, and multi-task post-training to improve rewriting capabilities. The approach enables dynamic updates to the rewrite vocabulary and achieves gains in online metrics and offline retrieval efficiency, demonstrated on Meituan Delivery's search system. The work shows that combining CoT, RAG, and online feedback with targeted post-training yields robust, domain-specific rewrites and practical improvements in real-world, high-traffic search environments.

Abstract

The essence of a modern e-commerce search system lies in matching the user's intent with the available candidates based on the user's query, providing personalized and precise service. However, a user's query may be incorrect because of ambiguous input or typos, leading to inaccurate search results. Such cases can be alleviated by query rewrite: modifying the query into another representation or an expansion. Traditional query rewrite, however, relies on a static rewrite vocabulary that is manually established and lacks interaction with both the domain knowledge of the e-commerce system and common knowledge of the real world. In this paper, leveraging the text-generation ability of Large Language Models (LLMs), we provide an iterative framework to generate query rewrites. The framework incorporates a 3-stage procedure in each iteration: Rewrite Generation, with domain knowledge supplied by Retrieval-Augmented Generation (RAG) and query understanding by Chain-of-Thought (CoT); Online Signal Collection, with automatic updates of positive rewrites; and Post-training of the LLM with a multi-task objective to generate new rewrites. Our work (named IterQR) provides a comprehensive framework to generate Query Rewrites with both domain and real-world knowledge. It automatically updates and self-corrects the rewrites during iterations. IterQR has been deployed in Meituan Delivery's search system (China's leading food delivery platform), providing service for users with significant improvement.
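The three-stage iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `ToyRewriter`, `iterate`, `retrieve`, and `is_positive` are hypothetical names, the LLM is replaced by a lookup table, post-training is replaced by memorising positive rewrites, and online feedback is replaced by a fixed set of "clicked" pairs. It only shows how the output of one stage feeds the next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class ToyRewriter:
    """Hypothetical stand-in for the LLM: remembers rewrites per query."""
    known: Dict[str, Set[str]] = field(default_factory=dict)

    def generate(self, query: str, context: List[str]) -> List[str]:
        # Stage 1 (assumption): a real system would prompt an LLM with
        # retrieved domain context (RAG) and a reasoning template (CoT).
        # Here we just propose remembered rewrites plus the context terms.
        candidates = set(self.known.get(query, set())) | set(context)
        candidates.discard(query)
        return sorted(candidates)

    def post_train(self, positives: Dict[str, Set[str]]) -> None:
        # Stage 3 (assumption): real post-training is multi-task
        # fine-tuning; here it is reduced to memorising positive rewrites.
        for query, rewrites in positives.items():
            self.known.setdefault(query, set()).update(rewrites)


def iterate(
    model: ToyRewriter,
    queries: List[str],
    retrieve: Callable[[str], List[str]],
    is_positive: Callable[[str, str], bool],
    n_iters: int = 2,
) -> Dict[str, Set[str]]:
    """Run n_iters rounds of generate -> collect signal -> post-train."""
    vocabulary: Dict[str, Set[str]] = {}
    for _ in range(n_iters):
        positives: Dict[str, Set[str]] = {}
        for q in queries:
            candidates = model.generate(q, retrieve(q))
            # Stage 2 (assumption): online feedback is simulated by a
            # predicate instead of real user interactions.
            kept = {r for r in candidates if is_positive(q, r)}
            if kept:
                positives[q] = kept
        model.post_train(positives)
        for q, rewrites in positives.items():
            vocabulary.setdefault(q, set()).update(rewrites)
    return vocabulary


# Toy data, invented purely for illustration.
domain_index = {"piza": ["pizza", "pasta"], "bbq": ["barbecue", "grill"]}
clicked = {("piza", "pizza"), ("bbq", "barbecue"), ("bbq", "grill")}

vocab = iterate(
    ToyRewriter(),
    ["piza", "bbq"],
    retrieve=lambda q: domain_index.get(q, []),
    is_positive=lambda q, r: (q, r) in clicked,
)
```

In this toy setup the rewrite vocabulary is rebuilt from feedback rather than maintained by hand, which is the property the abstract contrasts with a static, manually established vocabulary.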

Paper Structure

This paper contains 29 sections, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: Workflow of IterQR: an iterative framework of 3 stages. Stage 1 initializes and generates query rewrites based on the query. The prompt is designed to incorporate domain knowledge (associated interacted restaurants and cuisines). In addition, query rewrite is formulated as a query-understanding task and processed with CoT. The generated rewrites are fed to Stage 2 for online feedback collection, and the positive rewrites are used in Stage 3, where the LLM is post-trained with multi-task objectives using the positive rewrites as labels. The trained model is then used to generate new rewrites in Stage 1, driving continuous iteration.
  • Figure 2: Portion of unique rewrites generated by IterQR during iterations