Large Search Model: Redefining Search Stack in the Era of LLMs

Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei

TL;DR

The paper addresses the fragmentation of modern search stacks by proposing a Large Search Model (LSM): a customized large language model that unifies information retrieval tasks into autoregressive text generation guided by natural language prompts. This approach aims to improve generalization and reduce system complexity by handling diverse IR tasks within a single model, while acknowledging significant challenges in inference cost, long-context modeling, and responsible AI alignment. A proof-of-concept using LLaMA-7B on MS MARCO demonstrates competitive performance in listwise ranking and retrieval-augmented answer generation, with context length extended to 16k tokens. The work outlines practical deployment considerations, including efficiency techniques and mitigation of hallucinations, and calls for future benchmarks and research to scale and validate the unified framework across real-world search systems. Overall, the paper offers a roadmap for leveraging multi-modal LLMs and prompt-driven task customization to rethink the information retrieval stack in the era of LLMs.

Abstract

Modern search engines are built on a stack of different components, including query understanding, retrieval, multi-stage ranking, and question answering, among others. These components are often optimized and deployed independently. In this paper, we introduce a novel conceptual framework called large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM). All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts. This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack. To substantiate the feasibility of this framework, we present a series of proof-of-concept experiments and discuss the potential challenges associated with implementing this approach within real-world search systems.
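
The idea of formulating every search task as text generation customized by a natural language prompt can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the prompt wording in `TASK_PROMPTS`, the helpers `build_prompt` and `run_task`, and the `generate` callable standing in for a model are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical task instructions; the paper's actual prompts are not shown here.
TASK_PROMPTS: Dict[str, str] = {
    "rank": (
        "Rank the passages below by relevance to the query. "
        "Output passage ids from most to least relevant."
    ),
    "answer": (
        "Answer the query using only the passages below, "
        "citing the ids of the passages you rely on."
    ),
}


def build_prompt(task: str, query: str, passages: List[str]) -> str:
    """Serialize one search task as plain text for an autoregressive model."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task: {task}")
    # Number the passages so the model can refer to them by id in its output.
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        f"{TASK_PROMPTS[task]}\n\n"
        f"Query: {query}\n\n"
        f"Passages:\n{numbered}\n\n"
        "Output:"
    )


def run_task(
    task: str,
    query: str,
    passages: List[str],
    generate: Callable[[str], str],
) -> str:
    """Run any task through the same model; only the prompt changes."""
    return generate(build_prompt(task, query, passages))
```

In this sketch, switching from listwise ranking to answer generation means changing only the `task` argument, while the model behind `generate` stays the same. That is the sense in which a single model could replace several separately optimized components.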

Paper Structure

This paper contains 13 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Comparison of the conventional search stack and our proposed large search model. The conventional search stack comprises a cascading retrieval and ranking pipeline, along with many other components to generate the Search Engine Result Page (SERP). In contrast, our proposed framework employs a unified modeling approach, where prompts are utilized to customize the large search model for diverse search tasks. It is worth mentioning that the figure presented herein is for illustrative purposes only and does not correspond to any specific implementation of modern search engines. MLLM stands for Multi-modal Large Language Models.