LORE: A Large Generative Model for Search Relevance

Chenji Lu, Zhuo Chen, Hui Zhao, Zhiyuan Zeng, Gang Zhao, Junjie Ren, Ruicong Xu, Haoran Li, Songyan Liu, Pengjie Wang, Jian Xu, Bo Zheng

TL;DR

LORE is an end-to-end framework for large generative model-based search relevance in e-commerce. It decomposes relevance into three capabilities (knowledge and reasoning, multimodal matching, and rule adherence), trains the model in two stages (SFT, then RL), and introduces a dedicated benchmark, RAIR, to evaluate these capabilities systematically. The approach yields strong offline performance and, deployed through a query-frequency-stratified strategy, a cumulative +27% improvement in online GoodRate. The work offers a practical blueprint for industrial-scale relevance systems and insights for post-training development in vertical domains.

Abstract

Achievement. We introduce LORE, a systematic framework for Large Generative Model-based relevance in e-commerce search. Deployed and iterated over three years, LORE achieves a cumulative +27% improvement in online GoodRate metrics. This report shares the valuable experience gained throughout its development lifecycle, spanning data, features, training, evaluation, and deployment.

Insight. While existing works apply Chain-of-Thought (CoT) to enhance relevance, they often hit a performance ceiling. We argue this stems from treating relevance as a monolithic task, lacking principled deconstruction. Our key insight is that relevance comprises distinct capabilities: knowledge and reasoning, multi-modal matching, and rule adherence. We contend that a qualitative-driven decomposition is essential for breaking through current performance bottlenecks.

Contributions. LORE provides a complete blueprint for the LLM relevance lifecycle. Key contributions include: (1) A two-stage training paradigm combining progressive CoT synthesis via SFT with human preference alignment via RL. (2) A comprehensive benchmark, RAIR, designed to evaluate these core capabilities. (3) A query frequency-stratified deployment strategy that efficiently transfers offline LLM capabilities to the online system. LORE serves as both a practical solution and a methodological reference for other vertical domains.

Paper Structure

This paper contains 54 sections, 25 equations, 18 figures, 14 tables.

Figures (18)

  • Figure 1: Overview of the LORE theoretical framework and architecture.
  • Figure 2: A standard definition of the relevance task.
  • Figure 3: Comprehensive Overview of Item Attributes.
  • Figure 4: Comprehensive Overview of Query Understanding.
  • Figure 5: Overview of the Relevance Discrimination Process: (1) Path Construction, (2) Path Following.
  • ...and 13 more figures