CRM: Retrieval Model with Controllable Condition
Chi Liu, Jiangxia Cao, Rui Huang, Kuo Cai, Weifeng Ding, Qiang Luo, Kun Gai, Guorui Zhou
TL;DR
The paper addresses the misalignment between retrieval and ranking in industrial recommender systems by introducing the Controllable Retrieval Model (CRM), which conditions the retrieval stage on regression targets such as watch time. CRM extends the two-tower retrieval framework with either a neural-network or a Transformer-based design that incorporates regression signals during training, and it uses online strategies (maximum versus average watch time, and time-division multiplexing) to set the condition at inference time. In large-scale online A/B tests on Kuaishou's short-video platform, CRM yields consistent improvements in engagement metrics and video watch time, and ablation studies show that it outperforms multiple retrieval baselines in average time per video view. The results indicate that regression-conditioned control in retrieval can align the retrieved candidates with ranking objectives, offering practical gains for real-world, large-scale recommendation systems.
Abstract
Recommendation systems (RecSys) are designed to connect users with relevant items from a vast pool of candidates while aligning with the business goals of the platform. A typical industrial RecSys is composed of two main stages, retrieval and ranking: (1) the retrieval stage searches for hundreds of item candidates that satisfy the user's interests; (2) given the retrieved items, the ranking stage selects the best dozen or so by estimating multiple targets for each item candidate, including classification and regression targets. Unlike the ranking model, the retrieval model has no access to item candidate information during inference. Retrieval models are therefore often trained on classification targets only (e.g., click-through rate) and fail to incorporate regression targets (e.g., expected watch time), which limits the effectiveness of retrieval. In this paper, we propose the Controllable Retrieval Model (CRM), which integrates regression information as conditional features into the two-tower retrieval paradigm. This modification enables the retrieval stage to close the target gap with the ranking model, enhancing the retrieval model's ability to find item candidates that satisfy both the user's interests and the given condition. We validate the effectiveness of CRM through real-world A/B testing and demonstrate its successful deployment in the Kuaishou short-video recommendation system, which serves over 400 million users.
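To make the idea of a regression target acting as a conditional feature in a two-tower model concrete, the following is a minimal sketch and not the authors' implementation. It assumes the watch time is discretized into buckets, that each bucket has an embedding that is simply added to the user representation, and that retrieval is a brute-force inner-product search. The names `bucketize`, `user_tower`, `retrieve`, the bucket boundaries, and the random embeddings are all hypothetical and chosen only for illustration.

```python
import math
import random

DIM = 8                            # embedding size (assumed)
BUCKET_EDGES = [5.0, 15.0, 45.0]   # seconds; hypothetical bucket boundaries
NUM_BUCKETS = len(BUCKET_EDGES) + 1


def bucketize(watch_time_s):
    """Map a watch time in seconds to a discrete condition bucket."""
    return sum(watch_time_s >= edge for edge in BUCKET_EDGES)


def random_vector(rng, dim=DIM):
    """Stand-in for a learned embedding: a random Gaussian vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def user_tower(user_vec, condition_bucket, condition_table):
    """Toy user tower: add the condition embedding to the user vector."""
    cond = condition_table[condition_bucket]
    return normalize([u + c for u, c in zip(user_vec, cond)])


def retrieve(query_vec, item_vecs, k):
    """Brute-force top-k by inner product (stand-in for ANN search)."""
    scores = [
        (sum(q * v for q, v in zip(query_vec, item)), idx)
        for idx, item in enumerate(item_vecs)
    ]
    scores.sort(reverse=True)
    return [idx for _, idx in scores[:k]]


rng = random.Random(0)
condition_table = [random_vector(rng) for _ in range(NUM_BUCKETS)]
item_vecs = [normalize(random_vector(rng)) for _ in range(100)]
user_vec = random_vector(rng)

# Same user, two different conditions chosen at inference time.
short_query = user_tower(user_vec, bucketize(3.0), condition_table)
long_query = user_tower(user_vec, bucketize(60.0), condition_table)
top_short = retrieve(short_query, item_vecs, k=5)
top_long = retrieve(long_query, item_vecs, k=5)
```

In a trained model of this kind, the condition would presumably come from the observed watch time of each training sample, while at serving time it has to be chosen by some strategy, since the true watch time is not yet known. Because the item tower is unchanged, the item index can still be built offline; only the user-side query vector depends on the condition.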
