Large Search Model: Redefining Search Stack in the Era of LLMs
Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei
TL;DR
The paper addresses the fragmentation of modern search stacks by proposing a Large Search Model (LSM): a customized large language model that unifies information retrieval tasks into autoregressive text generation guided by natural language prompts. This approach aims to improve generalization and reduce system complexity by handling diverse IR tasks within a single model, while acknowledging significant challenges in inference cost, long-context modeling, and responsible AI alignment. A proof-of-concept using LLaMA-7B on MS MARCO demonstrates competitive performance in listwise ranking and retrieval-augmented answer generation, with context length extended to 16k tokens. The work outlines practical deployment considerations, including efficiency techniques and mitigation of hallucinations, and calls for future benchmarks and research to scale and validate the unified framework across real-world search systems. Overall, the paper offers a roadmap for leveraging multi-modal LLMs and prompt-driven task customization to rethink the information retrieval stack in the era of LLMs.
Abstract
Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others. These components are often optimized and deployed independently. In this paper, we introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM). All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts. This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack. To substantiate the feasibility of this framework, we present a series of proof-of-concept experiments and discuss the potential challenges associated with implementing this approach within real-world search systems.
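To make the prompt-driven formulation concrete, the following is a minimal sketch of how several search tasks might be routed through one text-generation interface. It is an illustration only, not the authors' implementation: the template wording, the task names, and the helpers `build_prompt` and `run_task` are all hypothetical, and `generate` stands in for any function that maps a prompt string to generated text.

```python
from typing import Callable, Dict, Sequence

# Hypothetical prompt templates (assumed wording, not taken from the paper).
# Each search task differs only in the instruction given to the same model.
PROMPTS: Dict[str, str] = {
    "rank": (
        "Query: {query}\nDocuments:\n{docs}\n"
        "List the document ids from most to least relevant."
    ),
    "answer": (
        "Query: {query}\nDocuments:\n{docs}\n"
        "Answer the query using only the documents above."
    ),
    "rewrite": "Query: {query}\nRewrite the query to be clearer.",
}


def build_prompt(task: str, query: str, docs: Sequence[str] = ()) -> str:
    """Render the template for `task`, numbering the documents from 1."""
    doc_block = "\n".join(f"[{i}] {d}" for i, d in enumerate(docs, start=1))
    # str.format ignores unused keyword arguments, so templates
    # without a {docs} placeholder are handled by the same call.
    return PROMPTS[task].format(query=query, docs=doc_block)


def run_task(
    generate: Callable[[str], str],
    task: str,
    query: str,
    docs: Sequence[str] = (),
) -> str:
    """Run any task as plain text generation with a single model callable."""
    return generate(build_prompt(task, query, docs))
```

Under these assumptions, switching from ranking to answer generation changes only the `task` argument, while the model behind `generate` stays the same; this is the sense in which a single model could replace separately deployed components.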
