Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models

Yuntao Gui, James Cheng

TL;DR

Search-R3 presents a framework that unifies large language model reasoning with embedding generation for retrieval. It introduces a two-stage training pipeline—instruction-guided representation learning to generate embeddings via a dedicated token, and reinforcement learning to optimize reasoning and embeddings within an end-to-end retrieval loop. A scalable RL environment with selective local graph refresh enables training at scale by updating only affected embedding regions. Across diverse benchmarks, Search-R3 achieves state-of-the-art retrieval performance when reasoning is enabled, demonstrating the advantage of integrating reasoning and semantic representation.

Abstract

Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses this limitation by adapting LLMs to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step-by-step through complex semantic analyses. We implement this through three complementary mechanisms: (1) a supervised learning stage that enables the model to produce quality embeddings, (2) a reinforcement learning (RL) methodology that optimizes embedding generation alongside reasoning, and (3) a specialized RL environment that efficiently handles evolving embedding representations without requiring complete corpus re-encoding at each training iteration. Our extensive evaluations on diverse benchmarks demonstrate that Search-R3 significantly outperforms prior methods by unifying the reasoning and embedding generation processes. This integrated post-training approach represents a substantial advancement in handling complex knowledge-intensive tasks that require both sophisticated reasoning and effective information retrieval. Project page: https://github.com/ytgui/Search-R3
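To make the third mechanism concrete, the following is a minimal sketch of refreshing only the cached embeddings near a query instead of re-encoding the whole corpus after a model update. It is not the paper's implementation: the TL;DR describes a selective local graph refresh, whereas this toy uses a flat list with brute-force search, and the names `ToyEncoder`, `EmbeddingCache`, and `refresh_local` are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyEncoder:
    """Hypothetical stand-in for the embedding model: weighted counts of 'a', 'b', 'c'."""

    def __init__(self, weights):
        self.weights = list(weights)

    def __call__(self, text):
        return [w * text.count(ch) for ch, w in zip("abc", self.weights)]


class EmbeddingCache:
    """Toy corpus index whose cached embeddings can be refreshed selectively."""

    def __init__(self, docs, encode):
        self.docs = list(docs)
        self.encode = encode
        self.vectors = [encode(d) for d in self.docs]  # full encode happens once
        self.encode_calls = len(self.docs)

    def search(self, query_vec, k):
        """Return indices of the k cached vectors most similar to the query."""
        order = sorted(
            range(len(self.docs)),
            key=lambda i: cosine(query_vec, self.vectors[i]),
            reverse=True,
        )
        return order[:k]

    def refresh_local(self, query_vec, k):
        """Re-encode only the k cached neighbours of the query; others stay stale."""
        touched = self.search(query_vec, k)
        for i in touched:
            self.vectors[i] = self.encode(self.docs[i])
            self.encode_calls += 1
        return touched


encoder = ToyEncoder([1, 1, 1])
cache = EmbeddingCache(["aaa", "aab", "bbb", "ccc"], encoder)

# Simulate a training step that changes the model's parameters.
encoder.weights = [2, 2, 2]

# Refresh only the region of the corpus that the current query touches.
touched = cache.refresh_local(encoder("aa"), k=2)
print(touched, cache.encode_calls)
```

In this toy run, only the two documents closest to the query are re-encoded after the simulated update, so the cache performs two extra encoder calls rather than four. The remaining entries keep their older embeddings until a later query lands near them, which is the trade-off any selective refresh scheme has to manage.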

Paper Structure

This paper contains 19 sections, 9 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of Search-R3.
  • Figure 2: Training pipeline of Search-R3.
  • Figure 3: System prompt in Stage 2.
  • Figure 4: Illustration of selective graph refresh mechanism.
  • Figure 5: Score distributions before and after RL training.