QP-OneModel: A Unified Generative LLM for Multi-Task Query Understanding in Xiaohongshu Search

Jianzhao Huang; Xiaorui Huang; Fei Zhao; Yunpeng Liu; Hui Zhang; Fangcheng Shi; Congfeng Li; Zechen Sun; Yi Wu; Yao Hu; Yunhan Bai; Shaosheng Cao

TL;DR

QP-OneModel reformulates industrial SNS query processing as a unified sequence-generation task, addressing limitations of cascaded discriminative pipelines and domain-mismatched LLMs. It combines a domain-adaptive RedOne backbone with a progressive three-stage alignment (Knowledge Injection, Target Distribution Alignment, Multi-Reward RL) to internalize complex business rules and semantics. A novel output, Intent Descriptions, provides high-fidelity semantic signals that boost downstream tasks like query rewriting and ranking. Offline results show substantial gains over baselines and strong generalization to unseen tasks, while online A/B tests confirm improved retrieval relevance and user retention, demonstrating practical industrial impact.

Abstract

Query Processing (QP) bridges user intent and content supply in large-scale Social Network Service (SNS) search engines. Traditional QP systems rely on pipelines of isolated discriminative models (e.g., BERT), suffering from limited semantic understanding and high maintenance overhead. While Large Language Models (LLMs) offer a potential solution, existing approaches often optimize sub-tasks in isolation, neglecting intrinsic semantic synergy and necessitating independent iterations. Moreover, standard generative methods often lack grounding in SNS scenarios: they fail to bridge the gap between open-domain corpora and informal SNS linguistic patterns, and they struggle to adhere to rigorous business definitions. We present QP-OneModel, a Unified Generative LLM for Multi-Task Query Understanding in the SNS domain. We reformulate heterogeneous sub-tasks into a unified sequence generation paradigm, adopting a progressive three-stage alignment strategy culminating in multi-reward Reinforcement Learning. Furthermore, QP-OneModel generates intent descriptions as a novel high-fidelity semantic signal, effectively augmenting downstream tasks such as query rewriting and ranking. Offline evaluations show QP-OneModel achieves a 7.35% overall gain over discriminative baselines, with significant F1 boosts in NER (+9.01%) and Term Weighting (+9.31%). It also exhibits superior generalization, surpassing a 32B model by 7.60% accuracy on unseen tasks. QP-OneModel is fully deployed at Xiaohongshu, where online A/B tests confirm its industrial value, improving retrieval relevance (DCG) by 0.21% and lifting user retention by 0.044%.
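
To make the unified sequence generation idea concrete, the following minimal sketch decodes one generated JSON string into per-task results. The field names (`entities`, `term_weights`, `intent_description`), the `QPResult` container, and the example values are hypothetical placeholders, not the schema used by QP-OneModel; only the idea of a single output covering several sub-tasks comes from the paper.

```python
import json
from dataclasses import dataclass


@dataclass
class QPResult:
    """Per-task results decoded from one generated sequence (hypothetical schema)."""
    entities: list[dict]
    term_weights: dict[str, float]
    intent_description: str


def parse_unified_output(raw: str) -> QPResult:
    """Decode a single JSON string into structured sub-task results.

    Missing keys fall back to empty values so that one absent sub-task
    does not discard the others (an assumption of this sketch).
    """
    data = json.loads(raw)
    return QPResult(
        entities=list(data.get("entities", [])),
        term_weights={k: float(v) for k, v in data.get("term_weights", {}).items()},
        intent_description=str(data.get("intent_description", "")),
    )


# Illustrative model output; the values are made up for this example.
example_output = json.dumps({
    "entities": [{"text": "Shanghai", "type": "LOCATION"}],
    "term_weights": {"Shanghai": 0.9, "brunch": 0.7},
    "intent_description": "Looking for brunch recommendations in Shanghai.",
})

result = parse_unified_output(example_output)
print(result.intent_description)
```

In a cascaded pipeline each of these fields would come from a separate model; here a single decode step yields all of them, which is the maintenance benefit the abstract points to.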

Paper Structure (41 sections, 12 equations, 3 figures, 7 tables)

Introduction
Related Work
Query Processing in Industrial Search
LLMs for Information Retrieval
Unified Generative Modeling and Alignment
Preliminaries
Methodology
Query Processing as Unified Sequence Generation
Business-Aware Prompt Design
Progressive Three-Stage Alignment Strategy
Knowledge Injection via Task Decomposition and Mixed-SFT
Target Distribution Alignment
Logic Internalization via Multi-Reward RL
Reward Design.
Group Relative Policy Optimization.
...and 26 more sections

Figures (3)

Figure 1: The overall framework of QP-OneModel, covering data construction, multi-stage SFT, and reinforcement learning.
Figure 2: Illustration of the Business-Aware Prompt schema. The prompt $\mathcal{P}$ integrates configurable business rules $R$ with dynamic contexts, including user rewrite history $C_{\text{hist}}$ and candidate notes $C_{\text{note}}$. This context-rich formulation guides the model to generate a unified JSON output covering all QP sub-tasks.
Figure 3: Overview of the deployment architecture. The framework utilizes a nearline inference strategy where QP-OneModel pre-computes results to update the KV-Cache daily. The retrieved structural signals and intent descriptions are then served to downstream tasks such as Query Rewriting and Ranking.
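
The Business-Aware Prompt described in the Figure 2 caption can be pictured with a small sketch that combines business rules $R$, rewrite history $C_{\text{hist}}$, and candidate notes $C_{\text{note}}$ into one prompt string. The function name `build_prompt`, the section headings, and the sample inputs are assumptions made for illustration; the actual prompt template is not given in this excerpt.

```python
import json


def build_prompt(query, rules, rewrite_history, candidate_notes):
    """Assemble a text prompt from rules R, history C_hist, and notes C_note.

    The layout below is a hypothetical template, not the paper's prompt.
    """
    sections = [
        "## Business rules",
        *[f"- {rule}" for rule in rules],
        "## User rewrite history",
        json.dumps(rewrite_history, ensure_ascii=False),
        "## Candidate notes",
        json.dumps(candidate_notes, ensure_ascii=False),
        "## Query",
        query,
        "Return one JSON object covering all sub-tasks.",
    ]
    return "\n".join(sections)


# Made-up inputs for illustration only.
prompt = build_prompt(
    query="shanghai brunch",
    rules=["Label place names as LOCATION."],
    rewrite_history=["brunch in shanghai"],
    candidate_notes=["Top 5 brunch spots near the Bund"],
)
print(prompt)
```

Keeping the rules as a plain list argument mirrors the caption's point that they are configurable: changing a business definition means editing the prompt input rather than retraining a task-specific model.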
